ENZYMATIC DNA MOLECULES



TECHNICAL FIELD 

The present invention relates to nucleic acid enzymes or catalytic (enzymatic)
DNA molecules that are capable of cleaving other nucleic acid molecules, particularly
RNA. The present invention also relates to compositions containing the disclosed
enzymatic DNA molecules and to methods of making and using such enzymes and
compositions.

BACKGROUND 

The need for catalysts that operate outside of their native context or which
catalyze reactions that are not represented in nature has resulted in the development of
"enzyme engineering" technology. The usual route taken in enzyme engineering has
been a "rational design" approach, relying upon the understanding of natural enzymes to
aid in the construction of new enzymes. Unfortunately, the state of proficiency in the
areas of protein structure and chemistry is insufficient to make the generation of novel
biological catalysts routine.

Recently, a different approach for developing novel catalysts has been applied.
This method involves the construction of a heterogeneous pool of macromolecules and
the application of an in vitro selection procedure to isolate molecules from the pool that
catalyze the desired reaction. Selecting catalysts from a pool of macromolecules is not
dependent on a comprehensive understanding of their structural and chemical
properties. Accordingly, this process has been dubbed "irrational design" (Brenner and
Lerner, PNAS USA 89: 5381-5383 (1992)).

Most efforts to date involving the rational design of enzymatic RNA molecules or
ribozymes have not led to molecules with fundamentally new or improved catalytic
function. However, the application of irrational design methods via a process we have
described as "directed molecular evolution" or "in vitro evolution", which is patterned
after Darwinian evolution of organisms in nature, has the potential to lead to the
production of DNA molecules that have desirable functional characteristics.

This technique has been applied with varying degrees of success to RNA
molecules in solution (see, e.g., Mills, et al., PNAS USA 58: 217 (1967); Green, et al.,
Nature 347: 406 (1990); Chowrira, et al., Nature 354: 320 (1991); Joyce, Gene 82: 83
(1989); Beaudry and Joyce, Science 257: 635-641 (1992); Robertson and Joyce,
Nature 344: 467 (1990)), as well as to RNAs bound to a ligand that is attached to a
solid support (Tuerk, et al., Science 249: 505 (1990); Ellington, et al., Nature 346: 818
(1990)). It has also been applied to peptides attached directly to a solid support (Lam,
et al., Nature 354: 82 (1991)); and to peptide epitopes expressed within a viral coat
protein (Scott, et al., Science 249: 386 (1990); Devlin, et al., Science 249: 249 (1990);
Cwirla, et al., PNAS USA 87: 6378 (1990)).

It has been more than a decade since the discovery of catalytic RNA (Kruger, et
al., Cell 31: 147-157 (1982); Guerrier-Takada, et al., Cell 35: 849-857 (1983)). The list
of known naturally-occurring ribozymes continues to grow (see Cech, in The RNA World,
Gesteland & Atkins (eds.), pp. 239-269, Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, NY (1993); Pyle, Science 261: 709-714 (1993); Symons, Curr. Opin.
Struct. Biol. 4: 322-330 (1994)) and, in recent years, has been augmented by synthetic
ribozymes obtained through in vitro evolution. (See, e.g., Joyce, Curr. Opin. Struct.
Biol. 4: 331-336 (1994); Breaker & Joyce, Trends Biotech. 12: 268-275 (1994);
Chapman & Szostak, Curr. Opin. Struct. Biol. 4: 618-622 (1994).)

It seems reasonable to assume that DNA can have catalytic activity as well, 
considering that it contains most of the same functional groups as RNA. However, with 
the exception of certain viral genomes and replication intermediates, nearly all of the 
DNA in biological organisms occurs as a complete duplex, precluding it from adopting a 
complex secondary and tertiary structure. Thus it is not surprising that DNA enzymes 
have not been found in nature. 

Until the advent of the present invention, the design, synthesis and use of 
catalytic DNA molecules with nucleotide-cleaving capabilities have not been disclosed or
demonstrated. Therefore, the discoveries and inventions disclosed herein are 
particularly significant, in that they highlight the potential of in vitro evolution as a 
means of designing increasingly more efficient catalytic molecules, including enzymatic 
DNA molecules that cleave other nucleic acids, particularly RNA. 

BRIEF SUMMARY OF THE INVENTION

The present invention thus contemplates a synthetic or engineered (i.e., non- 
naturally-occurring) catalytic DNA molecule (or enzymatic DNA molecule) capable of 
cleaving a substrate nucleic acid (NA) sequence at a defined cleavage site. The 
invention also contemplates an enzymatic DNA molecule having an endonuclease 
activity. 

In one preferred variation, the endonuclease activity is specific for a nucleotide 
sequence defining a cleavage site comprising single-stranded nucleic acid in a substrate 
nucleic acid sequence. In another preferred variation, the cleavage site is double- 
stranded nucleic acid. Similarly, substrate nucleic acid sequences may be single- 
stranded, double-stranded, partially single- or double-stranded, looped, or any 
combination thereof. 

In another contemplated embodiment, the substrate nucleic acid sequence 
includes one or more nucleotide analogues. In one variation, the substrate nucleic acid 
sequence is a portion of, or attached to, a larger molecule.

In various embodiments, the larger molecule is selected from the group 
consisting of RNA, modified RNA, DNA, modified DNA, nucleotide analogs, or 
composites thereof. In another example, the larger molecule comprises a composite of a 
nucleic acid sequence and a non-nucleic acid sequence.

In another embodiment, the invention contemplates that a substrate nucleic acid 
sequence includes one or more nucleotide analogs. A further variation contemplates 
that the single stranded nucleic acid comprises RNA, DNA, modified RNA, modified 
DNA, one or more nucleotide analogs, or any composite thereof. In one embodiment of 
the disclosed invention, the endonuclease activity comprises hydrolytic cleavage of a
phosphoester bond at the cleavage site.

In various preferred embodiments, the catalytic DNA molecules of the present 
invention are single-stranded in whole or in part. These catalytic DNA molecules may 
preferably assume a variety of shapes consistent with their catalytic activity. Thus, in 
one variation, a catalytic DNA molecule of the present invention includes one or more
hairpin loop structures. In yet another variation, a catalytic DNA molecule may assume
a shape similar to that of "hammerhead" ribozymes. In still other embodiments, a
catalytic DNA molecule may assume a conformation similar to that of Tetrahymena
thermophila ribozymes, e.g., those derived from group I introns.

Similarly, preferred catalytic DNA molecules of the present invention are able to
demonstrate site-specific endonuclease activity irrespective of the original orientation of
the substrate molecule. Thus, in one preferred embodiment, an enzymatic DNA
molecule of the present invention is able to cleave a substrate nucleic acid sequence
that is separate from the enzymatic DNA molecule -- i.e., it is not linked to the
DNAzyme. In another preferred embodiment, an enzymatic DNA molecule is able to
cleave an attached substrate nucleic acid sequence -- i.e., it is able to perform a reaction
similar to self-cleavage.

The invention also contemplates enzymatic DNA molecules (catalytic DNA 
molecules, deoxyribozymes or DNAzymes) having endonuclease activity, whereby the 
endonuclease activity requires the presence of a divalent cation. In various preferred,
alternative embodiments, the divalent cation is selected from the group consisting of
Pb²⁺, Mg²⁺, Mn²⁺, Zn²⁺, and Ca²⁺. Another variation contemplates that the
endonuclease activity requires the presence of a monovalent cation. In such alternative
embodiments, the monovalent cation is preferably selected from the group consisting of
Na⁺ and K⁺.

In various preferred embodiments of the invention, an enzymatic DNA molecule 
comprises a nucleotide sequence selected from the group consisting of SEQ ID NO 3;
SEQ ID NO 14; SEQ ID NO 15; SEQ ID NO 16; SEQ ID NO 17; SEQ ID NO 18; SEQ ID
NO 19; SEQ ID NO 20; SEQ ID NO 21; and SEQ ID NO 22. In other preferred
embodiments, a catalytic DNA molecule of the present invention comprises a nucleotide 
sequence selected from the group consisting of SEQ ID NO 23; SEQ ID NO 24; SEQ ID 
NO 25; SEQ ID NO 26; SEQ ID NO 27; SEQ ID NO 28; SEQ ID NO 29; SEQ ID NO 30; 
SEQ ID NO 31; SEQ ID NO 32; SEQ ID NO 33; SEQ ID NO 34; SEQ ID NO 35; SEQ ID 
NO 36; SEQ ID NO 37; SEQ ID NO 38; and SEQ ID NO 39. 

Another preferred embodiment contemplates that a catalytic DNA molecule of 
the present invention comprises a nucleotide sequence selected from the group 
consisting of SEQ ID NO 50 and SEQ ID NO 51. In yet another preferred embodiment, a
catalytic DNA molecule of the present invention comprises a nucleotide sequence 
selected from the group consisting of SEQ ID NOS 52 through 101. As disclosed 
herein, catalytic DNA molecules having sequences substantially similar to those 
disclosed herein are also contemplated. Thus, a wide variety of substitutions, deletions, 
insertions, duplications and other mutations may be made to the within-described 
molecules in order to generate a variety of other useful enzymatic DNA molecules; as 
long as said molecules display site-specific cleavage activity as disclosed herein, they 
are within the boundaries of this disclosure. 

In a further variation of the present invention, an enzymatic DNA molecule of the 
present invention preferably has a substrate binding affinity of about 1 µM or less. In
another embodiment, an enzymatic DNA molecule of the present invention binds
substrate with a KD of less than about 0.1 µM.

The present invention also discloses enzymatic DNA molecules having useful 
turnover rates. In one embodiment, the turnover rate is less than 5 hr⁻¹; in a preferred
embodiment, the rate is less than about 2 hr⁻¹; in a more preferred embodiment, the rate
is less than about 1 hr⁻¹; in an even more preferred embodiment, the turnover rate is
about 0.6 hr⁻¹ or less.

In still another embodiment, an enzymatic DNA molecule of the present 
invention displays a useful turnover rate wherein the kobs is less than 1 min⁻¹, preferably
less than 0.1 min⁻¹; more preferably, less than 0.01 min⁻¹; and even more preferably,
less than 0.005 min⁻¹. In one variation, the value of kobs is approximately 0.002 min⁻¹ or
less.
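
To put the kobs values above in practical terms, the following minimal Python sketch converts a rate constant into a half-life and a fraction cleaved. It assumes simple first-order (single-exponential) cleavage kinetics, which the text does not state explicitly, and the function names are illustrative only.

```python
import math


def half_life_min(k_obs_per_min: float) -> float:
    """Half-life in minutes for first-order decay with rate constant k_obs (min^-1)."""
    return math.log(2) / k_obs_per_min


def fraction_cleaved(k_obs_per_min: float, minutes: float) -> float:
    """Fraction of substrate cleaved after `minutes`, assuming first-order kinetics."""
    return 1.0 - math.exp(-k_obs_per_min * minutes)


def per_min_to_per_hr(rate_per_min: float) -> float:
    """Convert a rate constant from min^-1 to hr^-1."""
    return rate_per_min * 60.0


if __name__ == "__main__":
    # Threshold values of k_obs named in the text (min^-1).
    for k in (1.0, 0.1, 0.01, 0.005, 0.002):
        print(
            f"k_obs = {k} min^-1: half-life = {half_life_min(k):.1f} min, "
            f"fraction cleaved in 2 h = {fraction_cleaved(k, 120):.3f}"
        )
```

Under this assumption, a smaller kobs simply means a proportionally longer half-life; the sketch makes no claim about how kobs relates to the turnover rates quoted in the preceding paragraph.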

The present invention also contemplates embodiments in which the catalytic rate 
of the disclosed DNA enzymes is fully optimized. Thus, in various preferred 
embodiments, the Km for reactions enhanced by the presence of Mg²⁺ is approximately
0.5-20 mM, preferably about 1-10 mM, and more preferably about 2-5 mM. 

The present invention also contemplates an embodiment whereby the nucleotide 
sequence defining the cleavage site comprises at least one nucleotide. In various other
preferred embodiments, a catalytic DNA molecule of the present invention is able to 
recognize and cleave a nucleotide sequence defining a cleavage site of two or more 
nucleotides. 

In various preferred embodiments, an enzymatic DNA molecule of the present
invention comprises a conserved core flanked by one or more substrate binding regions.
In one embodiment, an enzymatic DNA molecule includes first and second substrate
binding regions. In another embodiment, an enzymatic DNA molecule includes two or
more substrate binding regions.

As noted previously, preferred catalytic DNA molecules of the present invention
may also include a conserved core. In one preferred embodiment, the conserved core
comprises one or more conserved regions. In other preferred variations, the one or more
conserved regions include a nucleotide sequence selected from the group consisting of
CG; CGA; AGCG; AGCCG; CAGCGAT; CTTGTTT; and CTTATTT (see, e.g., Fig. 3).

In one embodiment of the invention, an enzymatic DNA molecule of the present
invention further comprises one or more variable or spacer nucleotides between the
conserved regions in the conserved core. In another embodiment, an enzymatic DNA
molecule of the present invention further comprises one or more variable or spacer
nucleotides between the conserved core and the substrate binding region.

In one variation, the first substrate binding region preferably includes a
nucleotide sequence selected from the group consisting of CATCTCT; GCTCT;
TTGCTTTTT; TGTCTTCTC; TTGCTGCT; GCCATGCTTT (SEQ ID NO 40); CTCTATTTCT
(SEQ ID NO 41); GTCGGCA; CATCTCTTC; and ACTTCT. In another preferred variation,
the second substrate binding region includes a nucleotide sequence selected from the
group consisting of TATGTGACGCTA (SEQ ID NO 42); TATAGTCGTA (SEQ ID NO 43);
ATAGCGTATTA (SEQ ID NO 44); ATAGTTACGTCAT (SEQ ID NO 45);
AATAGTGAAGTGTT (SEQ ID NO 46); TATAGTGTA; ATAGTCGGT; ATAGGCCCGGT
(SEQ ID NO 47); AATAGTGAGGCTTG (SEQ ID NO 48); and ATGNTG.

In various embodiments of the present invention, the substrate binding regions 
vary in length. Thus, for example, a substrate binding region may comprise a single
nucleotide to dozens of nucleotides. However, it is understood that substrate binding
regions of about 3-25 nucleotides in length, preferably about 3-15 nucleotides in length,
and more preferably about 3-10 nucleotides in length are particularly preferred. In
various embodiments, the individual nucleotides in the substrate binding regions are able
to form complementary base pairs with the nucleotides of the substrate molecules; in
other embodiments, noncomplementary base pairs are formed. A mixture of
complementary and noncomplementary base pairing is also contemplated as falling
within the scope of the disclosed embodiments of the invention.
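
The distinction between complementary and noncomplementary pairing in a substrate binding region can be sketched in a few lines of Python. This is only an illustration: the helper name `pairing_pattern` is hypothetical, the example sequences are made up rather than taken from the sequence listing, and only Watson-Crick pairs (A-T, A-U, G-C) are treated as complementary, so wobble pairs are counted as noncomplementary by assumption.

```python
# Watson-Crick pairs between a DNA binding arm and a DNA or RNA target.
PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}


def pairing_pattern(arm_5to3: str, target_5to3: str) -> str:
    """Mark each antiparallel position '|' (complementary) or '•' (noncomplementary).

    Both sequences are written 5'->3'; the target is reversed so that the
    5' end of the arm is compared with the 3' end of the target.
    """
    if len(arm_5to3) != len(target_5to3):
        raise ValueError("sequences must have equal length")
    target_3to5 = target_5to3[::-1]
    return "".join(
        "|" if (a, t) in PAIRS else "•" for a, t in zip(arm_5to3, target_3to5)
    )


if __name__ == "__main__":
    # Hypothetical 7-nucleotide binding arm against two made-up RNA targets.
    print(pairing_pattern("CATCTCT", "AGAGAUG"))  # fully complementary target
    print(pairing_pattern("CATCTCT", "AGAGGUG"))  # target with one mismatch
```

A pattern containing both symbols corresponds to the mixture of complementary and noncomplementary pairing described above.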

In another preferred embodiment, a catalytic DNA molecule of the present 
invention may further comprise a third substrate binding region. In some preferred 
embodiments, the third region includes a nucleotide sequence selected from the group 
consisting of TGTT; TGTTA; and TGTTAG. Another preferred embodiment of the 
present invention discloses an enzymatic DNA molecule further comprising one or more 
variable or "spacer" regions between the substrate binding regions. 

In another disclosed embodiment, the present invention contemplates a purified, 
synthetic enzymatic DNA molecule separated from other DNA molecules and 
oligonucleotides, the enzymatic DNA molecule having an endonuclease activity, wherein 
the endonuclease activity is specific for a nucleotide sequence defining a cleavage site 
comprising single- or double-stranded nucleic acid in a substrate nucleic acid sequence. 
In one variation, a synthetic (or engineered) enzymatic DNA molecule having an 
endonuclease activity is disclosed, wherein the endonuclease activity is specific for a 
nucleotide sequence defining a cleavage site consisting essentially of a single- or double- 
stranded region of a substrate nucleic acid sequence. 

In yet another embodiment, the invention contemplates an enzymatic DNA 
molecule comprising a deoxyribonucleotide polymer having a catalytic activity for
hydrolyzing a nucleic acid-containing substrate to produce substrate cleavage products. 
In one variation, the hydrolysis takes place in a site-specific manner. As noted 
previously, the polymer may be single-stranded, double-stranded, or some combination 
of both. 

The invention further contemplates that the substrate comprises a nucleic acid 
sequence. In various embodiments, the nucleic acid sequence substrate comprises 
RNA, modified RNA, DNA, modified DNA, one or more nucleotide analogs, or 
composites of any of the foregoing. One embodiment contemplates that the substrate 
includes a single-stranded segment; still another embodiment contemplates that the 
substrate is double-stranded. 

The present invention also contemplates an enzymatic DNA molecule comprising 
a deoxyribonucleotide polymer having a catalytic activity for hydrolyzing a nucleic acid-
containing substrate to produce a cleavage product. In one variation, the enzymatic
DNA molecule has an effective binding affinity for the substrate and lacks an effective 
binding affinity for the cleavage product. 

In one preferred embodiment, the invention discloses a non-naturally-occurring 
enzymatic DNA molecule comprising a nucleotide sequence defining a conserved core 
flanked by recognition domains, variable regions, and spacer regions. Thus, in one 
preferred embodiment, the nucleotide sequence defines a first variable region contiguous 
or adjacent to the 5'-terminus of the molecule, a first recognition domain located
3'-terminal to the first variable region, a first spacer region located 3'-terminal to the
first recognition domain, a first conserved region located 3'-terminal to the first spacer
region, a second spacer region located 3'-terminal to the first conserved region, a
second conserved region located 3'-terminal to the second spacer region, a second
recognition domain located 3'-terminal to the second conserved region, and a second
variable region located 3'-terminal to the second recognition domain.

In another embodiment, the nucleotide sequence preferably defines a first 
variable region contiguous or adjacent to the 5'-terminus of the molecule, a first
recognition domain located 3'-terminal to the first variable region, a first spacer region
located 3'-terminal to the first recognition domain, a first conserved region located
3'-terminal to the first spacer region, a second spacer region located 3'-terminal to the
first conserved region, a second conserved region located 3'-terminal to the second
spacer region, a second recognition domain located 3'-terminal to the second conserved
region, a second variable region located 3'-terminal to the second recognition domain,
and a third recognition domain located 3'-terminal to the second variable region.

In one variation of the foregoing, the molecule includes a conserved core region 
flanked by two substrate binding domains; in another, the conserved core region 
comprises one or more conserved domains. In other preferred embodiments, the 
conserved core region further comprises one or more variable or spacer nucleotides. In 
yet another embodiment, an enzymatic DNA molecule of the present invention further 
comprises one or more spacer regions. 

The present invention further contemplates a wide variety of compositions. For 
example, compositions including an enzymatic DNA molecule as described hereinabove 
are disclosed and contemplated herein. In one alternative embodiment, a composition 
according to the present invention comprises two or more populations of enzymatic 
DNA molecules as described above, wherein each population of enzymatic DNA 
molecules is capable of cleaving a different sequence in a substrate. In another 
variation, a composition comprises two or more populations of enzymatic DNA 
molecules as described hereinabove, wherein each population of enzymatic DNA 
molecules is capable of recognizing a different substrate. In various embodiments, it is 
also preferred that compositions include a monovalent or divalent cation. 

The present invention further contemplates methods of generating, selecting, 
and isolating enzymatic DNA molecules of the present invention. In one variation, a 
method of selecting enzymatic DNA molecules that cleave a nucleic acid sequence (e.g.,
RNA) at a specific site comprises the following steps: (a) obtaining a population of
putative enzymatic DNA molecules, whether naturally-occurring or synthetic, and
preferably single-stranded DNA molecules; (b) admixing
nucleotide-containing substrate sequences with the aforementioned population of DNA 
molecules to form an admixture; (c) maintaining the admixture for a sufficient period of 
time and under predetermined reaction conditions to allow the putative enzymatic DNA 
molecules in the population to cause cleavage of the substrate sequences, thereby 
producing substrate cleavage products; (d) separating the population of DNA molecules 
from the substrate sequences and substrate cleavage products; and (e) isolating DNA 
molecules that cleave substrate nucleic acid sequences (e.g., RNA) at a specific site 
from the population. 

In a further variation of the foregoing method, the DNA molecules that cleave 
substrate nucleic acid sequences at a specific site are tagged with an immobilizing 
agent. In one example, the agent comprises biotin. 

In yet another variation of the aforementioned method, one begins by selecting a 
sequence (e.g., a predetermined "target" nucleotide sequence) that one wishes to
cleave using an enzymatic DNA molecule engineered for that purpose. Thus, in one 
embodiment, the pre-selected (or predetermined) "target" sequence is used to generate 
a population of DNA molecules capable of cleaving substrate nucleic acid sequences at 
a specific site via attaching or "tagging" it to a deoxyribonucleic acid sequence 
containing one or more randomized sequences or segments. In one variation, the 
randomized sequence is about 40 nucleotides in length; in another variation, the 
randomized sequence is about 50 nucleotides in length. Randomized sequences that are 
1-40, 40-50, and 50-100 nucleotides in length are also contemplated by the present 
invention. 
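
The randomized-segment lengths mentioned above determine how many distinct sequences are theoretically possible. The following Python sketch computes that number and builds a small random pool. It is a minimal illustration, not the procedure used to prepare any pool in this disclosure: the function names are hypothetical, and the only sequence borrowed from the document is the fixed 3'-terminal sequence quoted in the description of Figure 3 (SEQ ID NO 1), used here as an example flank.

```python
import random

ALPHABET = "ACGT"


def sequence_space(n_random: int) -> int:
    """Number of distinct DNA sequences of length n_random (4 ** n_random)."""
    return len(ALPHABET) ** n_random


def random_pool(size, n_random, left="", right="", seed=0):
    """Return `size` sequences: a random segment flanked by fixed regions."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [
        left + "".join(rng.choice(ALPHABET) for _ in range(n_random)) + right
        for _ in range(size)
    ]


if __name__ == "__main__":
    for n in (40, 50):
        print(f"N{n}: {sequence_space(n):.3e} possible sequences")
    # Example flank: fixed 3'-terminal sequence from the Figure 3 description.
    for seq in random_pool(3, 50, right="CGGTAAGCTTGGCAC"):
        print(seq)
```

Because the theoretical sequence space grows as 4 to the power of the segment length, any physical pool with a 40- or 50-nucleotide random segment can sample only a small fraction of it.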

In one embodiment of the present invention, the nucleotide sequence used to 
generate a population of enzymatic DNA molecules is selected from the group consisting 
of SEQ ID NO 4, 23, 50 and 51. In another embodiment, the "target" or "substrate"
nucleotide sequence comprises a sequence of one or more ribonucleotides (see, e.g.,
the relevant portions of SEQ ID NOS 4 and 23, and SEQ ID NO 49). It is also
contemplated by the present invention that a useful "target" or "substrate" nucleotide 
sequence may comprise DNA, RNA, or a composite thereof. 

The invention also contemplates methods as described above, wherein the 
isolating step further comprises exposing the tagged DNA molecules to a solid surface 
having avidin linked thereto, whereby the tagged DNA molecules become attached to 
the solid surface. As before, the substrate may be RNA, DNA, a composite of both, or 
a molecule including nucleotide sequences. 

The present invention also contemplates a method for specifically cleaving a 
substrate nucleic acid sequence at a particular cleavage site, comprising the steps of (a) 
providing an enzymatic DNA molecule capable of cleaving a substrate nucleic acid
sequence at a specific cleavage site; and (b) contacting the enzymatic DNA molecule
with the substrate nucleic acid sequence to cause specific cleavage of the nucleic acid
sequence at the cleavage site. In one variation, the enzymatic DNA molecule is a
non-naturally-occurring (or synthetic) DNA molecule. In another variation, the enzymatic
DNA molecule is single-stranded.

In still another variation of the foregoing method, the substrate comprises a 
nucleic acid. In various embodiments, the substrate nucleic acid comprises RNA, 
modified RNA, DNA, modified DNA, one or more nucleotide analogs, or composites of 
any of the foregoing. In yet another embodiment, the specific cleavage is caused by the
endonuclease activity of the enzymatic DNA molecule. Alteration of reaction conditions
-- e.g., the adjustment of pH, temperature, percent cation, percent enzyme, percent
substrate, and percent product -- is also contemplated herein.

The present invention also contemplates a method of cleaving a phosphoester 
bond, comprising (a) admixing a catalytic DNA molecule capable of cleaving a
substrate nucleic acid sequence at a defined cleavage site with a phosphoester
bond-containing substrate, to form a reaction admixture; and (b) maintaining the admixture
under predetermined reaction conditions to allow the enzymatic DNA molecule to cleave
the phosphoester bond, thereby producing a population of substrate products. In one
embodiment, the enzymatic DNA molecule is able to cleave the phosphoester bond in a
site-specific manner. In another embodiment, the method further comprises the steps of
(c) separating the products from the catalytic DNA molecule; and (d) adding additional
substrate to the enzymatic DNA molecule to form a new reaction admixture.

The present invention also contemplates methods of engineering enzymatic DNA 
molecules that cleave phosphoester bonds. One exemplary method comprises the
following steps: (a) obtaining a population of single-stranded DNA molecules; (b)
introducing genetic variation into the population to produce a variant population; (c)
selecting individuals from the variant population that meet predetermined selection
criteria; (d) separating the selected individuals from the remainder of the variant
population; and (e) amplifying the selected individuals.
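
Steps (a) through (e) above have the shape of a mutate-select-amplify loop, which the following Python sketch imitates in software. It is a toy model under loud assumptions: actual selection is a biochemical process, whereas here the "selection criterion" is a hypothetical score (occurrences of the motif AGCG), and every function name, rate, and fraction is invented for illustration.

```python
import random

ALPHABET = "ACGT"


def random_population(size, length, rng):
    """(a) Obtain a population of random single-stranded DNA sequences."""
    return ["".join(rng.choice(ALPHABET) for _ in range(length)) for _ in range(size)]


def mutate(seq, rate, rng):
    """(b) With probability `rate`, redraw each base at random (may redraw the same base)."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else b for b in seq)


def toy_score(seq, motif="AGCG"):
    """Hypothetical stand-in for a selection criterion: count of `motif` in `seq`."""
    return seq.count(motif)


def evolve(population, rounds, keep_fraction=0.2, rate=0.05, seed=0, score=toy_score):
    """Run `rounds` of variation, selection, separation, and amplification."""
    rng = random.Random(seed)
    size = len(population)
    for _ in range(rounds):
        variants = [mutate(s, rate, rng) for s in population]      # (b) variation
        ranked = sorted(variants, key=score, reverse=True)         # (c) selection
        selected = ranked[: max(1, int(size * keep_fraction))]     # (d) separation
        population = [rng.choice(selected) for _ in range(size)]   # (e) amplification
    return population


if __name__ == "__main__":
    rng = random.Random(1)
    start = random_population(200, 30, rng)
    final = evolve(start, rounds=5)
    print("mean toy score before:", sum(map(toy_score, start)) / len(start))
    print("mean toy score after: ", sum(map(toy_score, final)) / len(final))
```

Amplification is modeled as sampling with replacement from the selected individuals, so the population size stays constant from round to round; that choice is a simplification, not a description of PCR.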

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 illustrates a selective amplification scheme for isolation of DNAs that
cleave a target RNA phosphoester. As shown, double-stranded DNA that contains a
stretch of 50 random nucleotides (the molecule with "N₅₀" indicated above it) is
amplified by PCR, employing a 5'-biotinylated DNA primer that is terminated at the 3'
end by an adenosine ribonucleotide (rA). (The biotin label is indicated via the encircled
letter "B".) This primer is extended by Taq polymerase to yield a DNA product that
contains a single embedded ribonucleotide. The resulting double-stranded DNA is
immobilized on a streptavidin matrix and the unbiotinylated DNA strand is removed by
washing with 0.2 N NaOH. After re-equilibrating the column with a buffered solution,
the column is washed with the same solution with added 1 mM PbOAc. DNAs that
undergo Pb²⁺-dependent self-cleavage are released from the column, collected in the
eluant, and amplified by PCR. The PCR products are then used to initiate the next round
of selective amplification.

Figure 2 illustrates self-cleavage activity of the starting pool of DNA (G0) and
populations obtained after the first through fifth rounds of selection (G1 - G5), in the
presence of lead cation (Pb²⁺). The symbol Pre represents 108-nucleotide precursor
DNA (SEQ ID NO 4); Clv, 28-nucleotide 5'-cleavage product (SEQ ID NO 5); and M,
primer 3a (SEQ ID NO 6), which corresponds in length to the 5'-cleavage product.

Figure 3 illustrates the sequence alignment of individual variants isolated from
the population after five rounds of selection. The fixed substrate domain is shown at
the top, with the target riboadenylate identified via an inverted triangle. Substrate
nucleotides that are commonly involved in presumed base-pairing interactions are
indicated by vertical bars. Sequences corresponding to the 50 initially-randomized
nucleotides are aligned antiparallel to the substrate domain. All of the variants are
3'-terminated by the fixed sequence 5'-CGGTAAGCTTGGCAC-3' (not shown; SEQ ID
NO 1). Nucleotides within the initially-randomized region that are presumed to form
base pairs with the substrate domain are indicated on the right and left sides of the
Figure; the putative base-pair-forming regions of the enzymatic DNA molecules are
individually boxed in each sequence shown. Conserved regions are illustrated via the
two large, centrally-located boxes.

Figures 4A and 4B illustrate DNA-catalyzed cleavage of an RNA phosphoester in
an intermolecular reaction that proceeds with catalytic turnover. Figure 4A is a
diagrammatic representation of the complex formed between the 19mer substrate
(3'-TCACTATrAGGAAGAGATGG-5', SEQ ID NO 2) and 38mer DNA enzyme
(5'-ACACATCTCTGAAGTAGCGCCGCCGTATAGTGACGCTA-3', SEQ ID NO 3). The
substrate contains a single adenosine ribonucleotide ("rA", adjacent to the arrow),
flanked by deoxyribonucleotides. The synthetic DNA enzyme is a 38-nucleotide portion
of the most frequently occurring variant shown in Fig. 3. Highly-conserved nucleotides
located within the putative catalytic domain are "boxed". As illustrated, one conserved
sequence is "AGCG", while another is "CG" (reading in the 5'-3' direction).
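
The conserved sequences named above can be located in the 38mer of SEQ ID NO 3 with a simple string search. The Python sketch below does this; the helper `find_all` is an illustrative name, and the search reports every occurrence, so it cannot by itself say which occurrence is the one boxed in the figure.

```python
def find_all(seq: str, motif: str) -> list:
    """Return the 0-based start index of every occurrence of `motif` in `seq`."""
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)  # step by one to catch overlaps
    return hits


# 38mer DNA enzyme, written 5'->3' as quoted in the Figure 4A description.
ENZYME_38MER = "ACACATCTCTGAAGTAGCGCCGCCGTATAGTGACGCTA"

if __name__ == "__main__":
    print("length:", len(ENZYME_38MER))
    for motif in ("AGCG", "CG"):
        print(motif, "found at 0-based positions", find_all(ENZYME_38MER, motif))
```

The four-nucleotide motif occurs once in this sequence, whereas the dinucleotide occurs several times, which illustrates why short conserved regions need positional context from the alignment to be interpreted.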

Figure 4B shows an Eadie-Hofstee plot used to determine Km (negative slope)
and Vmax (y-intercept) for DNA-catalyzed cleavage of [5'-³²P]-labeled substrate under
conditions identical to those employed during in vitro selection. Initial rates of cleavage
were determined for reactions involving 5 nM DNA enzyme and either 0.125, 0.5, 1, 2,
or 4 µM substrate.
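
An Eadie-Hofstee analysis plots the initial rate v against v/[S]; for Michaelis-Menten kinetics the points fall on the line v = Vmax - Km (v/[S]), so the slope is the negative of Km and the y-intercept is Vmax. The Python sketch below fits that line by ordinary least squares. The substrate concentrations are those listed above, but the rates are synthetic values generated from hypothetical Km and Vmax, since the measured rates are not given in the text.

```python
def michaelis_menten(s, km, vmax):
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def eadie_hofstee_fit(substrate, rates):
    """Least-squares line through the points (v/[S], v); returns (Km, Vmax)."""
    xs = [v / s for v, s in zip(rates, substrate)]
    ys = list(rates)
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, intercept  # Km is the negative slope, Vmax the y-intercept


if __name__ == "__main__":
    conc_uM = [0.125, 0.5, 1, 2, 4]  # substrate concentrations from the text
    true_km, true_vmax = 2.0, 1.0    # hypothetical values, NOT from the source
    rates = [michaelis_menten(s, true_km, true_vmax) for s in conc_uM]
    km, vmax = eadie_hofstee_fit(conc_uM, rates)
    print(f"fitted Km = {km:.3f} uM, fitted Vmax = {vmax:.3f} (arbitrary units)")
```

With noise-free synthetic data the fit returns the input parameters; with real measurements, error in v affects both axes of the plot, which is a known limitation of this linearization.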

Figure 5 is a photographic representation showing a polyacrylamide gel
demonstrating specific endoribonuclease activity of four families of selected catalytic
DNAs. Selection of a Pb²⁺-dependent family of molecules was repeated in a
side-by-side fashion as a control (first group). In the second group, Zn²⁺ is used as the
cation; in group three, the cation is Mn²⁺; and in the fourth group, the cation is Mg²⁺. A
fifth site on the gel consists of the cleavage product alone, as a marker.

As noted, there are three lanes within each of the aforementioned four groups. 
In each group of three lanes, the first lane shows the lack of activity of the selected 
population in the absence of the metal cation, the second lane shows the observed 
activity in the presence of the metal cation, and the third lane shows the lack of activity 
of the starting pool (G0).

Figures 6A and 6B provide two-dimensional illustrations of a "progenitor"
catalytic DNA molecule and one of several catalytic DNA molecules obtained via the
selective amplification methods disclosed herein, respectively. Figure 6A illustrates an
exemplary molecule from the starting pool, showing the overall configuration of the
molecules represented by SEQ ID NO 23. As illustrated, various complementary
nucleotides flank the random (N^) region. Figure 6B is a diagrammatic representation of
one of the Mg²⁺-dependent catalytic DNA molecules (or "DNAzymes") generated via the
within-described procedures. The location of the ribonucleotide in the substrate nucleic
acid is indicated via the arrow in both Figs. 6A and 6B.

Figure 7 illustrates some of the results of ten rounds of in vitro selective
amplification carried out essentially as described in Example 5 hereinbelow. As shown,
two sites and two families of catalysts emerged as displaying the most efficient
cleavage of the target sequence. Cleavage conditions were essentially as indicated in
Fig. 7, namely, 10 mM Mg²⁺, pH 7.5, and 37°C; data collected after the reaction ran for
2 hours are shown. Cleavage (%) is shown plotted against the number of generations
(here, 0 through 10). The number/prevalence of catalytic DNA molecules capable of
cleaving the target sequence at the indicated sites in the substrate is illustrated via the
vertical bars, with cleavage at G↓UAACUAGAGAU shown by the striped bars, and with
cleavage at GUAACUA↓GAGAU illustrated via the open (lightly-shaded) bars.

Figure 8 illustrates the nucleotide sequences, cleavage sites, and turnover rates
of two catalytic DNA molecules of the present invention, clones 8-17 and 10-23.
Reaction conditions were as shown, namely, 10 mM Mg²⁺, pH 7.5, and 37°C. The
DNAzyme identified as clone 8-17 is illustrated on the left, with the site of cleavage of
the RNA substrate indicated by the arrow. The substrate sequence
(5'-GGAAAAAGUAACUAGAGAUGGAAG-3'), which is separate from the DNAzyme (i.e.,
intermolecular cleavage is shown), is labeled as such. Similarly, the DNAzyme
identified herein as 10-23 is shown on the right, with the site of cleavage of the RNA
substrate indicated by the arrow. Again, the substrate sequence is indicated. For the
8-17 enzyme, the turnover rate was approximately 0.6 hr⁻¹; for the 10-23 enzyme, the
turnover rate was approximately 1 hr⁻¹. Noncomplementary pairings are indicated with a
closed circle (•), whereas complementary pairings are indicated with a vertical line (|).

Figure 9 further illustrates the nucleotide sequences, cleavage sites, and
turnover rates of two catalytic DNA molecules of the present invention, clones 8-17 and
10-23. Reaction conditions were as shown, namely, 10 mM Mg²⁺, pH 7.5, and 37°C.
As in Fig. 8, the DNAzyme identified as clone 8-17 is illustrated on the left, with the site
of cleavage of the RNA substrate indicated by the arrow. The substrate sequence
(5'-GGAAAAAGUAACUAGAGAUGGAAG-3'), which is separate from the DNAzyme (i.e.,
intermolecular cleavage is shown), is labeled as such. Similarly, the DNAzyme
identified herein as 10-23 is shown on the right, with the site of cleavage of the RNA
substrate indicated by the arrow. Again, the substrate sequence is indicated. For the
8-17 enzyme, k_obs was approximately 0.002 min⁻¹; for the 10-23 enzyme, the value of
k_obs was approximately 0.01 min⁻¹. Noncomplementary pairings are indicated with a
closed circle (•), whereas complementary pairings are indicated with a vertical line (|).
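
For orientation only, the k_obs values quoted for Fig. 9 can be converted into half-lives and expected extents of cleavage. The sketch below assumes simple first-order (single-exponential) kinetics, which the figure description does not state; the function names are illustrative and are not part of the disclosure.

```python
import math


def half_life_minutes(k_obs_per_min: float) -> float:
    """Time for half of the substrate to be cleaved, assuming first-order decay."""
    return math.log(2) / k_obs_per_min


def fraction_cleaved(k_obs_per_min: float, minutes: float) -> float:
    """Fraction of substrate cleaved after `minutes`, assuming first-order decay."""
    return 1.0 - math.exp(-k_obs_per_min * minutes)


if __name__ == "__main__":
    # k_obs values as quoted for clones 8-17 and 10-23 in the Fig. 9 description.
    for name, k_obs in (("8-17", 0.002), ("10-23", 0.01)):
        print(name, round(half_life_minutes(k_obs), 1), round(fraction_cleaved(k_obs, 120), 3))
```

Under that assumption, a fivefold difference in k_obs translates into a fivefold difference in half-life, which is one way to compare the two clones at a glance.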

DETAILED DESCRIPTION

A. Definitions

As used herein, the term "deoxyribozyme" is used to describe a DNA-containing 
nucleic acid that is capable of functioning as an enzyme. In the present disclosure, the 
term "deoxyribozyme" includes endoribonucleases and endodeoxyribonucleases, 
although deoxyribozymes with endoribonuclease activity are particularly preferred. 
Other terms used interchangeably with deoxyribozyme herein are "enzymatic DNA 
molecule", "DNAzyme", or "catalytic DNA molecule", which terms should all be 
understood to include enzymatically active portions thereof, whether they are produced 
synthetically or derived from organisms or other sources. 

The term "enzymatic DNA molecules" also includes DNA molecules that have 
complementarity in a substrate-binding region to a specified oligonucleotide target or 
substrate; such molecules also have an enzymatic activity which is active to specifically 
cleave the oligonucleotide substrate. Stated in another fashion, the enzymatic DNA 
molecule is capable of cleaving the oligonucleotide substrate intermolecularly. This 
complementarity functions to allow sufficient hybridization of the enzymatic DNA 
molecule to the substrate oligonucleotide to allow the intermolecular cleavage of the 
substrate to occur. While one-hundred percent (100%) complementarity is preferred, 
complementarity in the range of 75-100% is also useful and contemplated by the
present invention. 
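
As an illustration of the 75-100% complementarity range mentioned above, the following minimal sketch scores a DNA binding region against an RNA target site. It assumes antiparallel Watson-Crick pairing only (G-U wobble pairs are not counted) and equal-length sequences; the function name and example sequences are hypothetical.

```python
# Watson-Crick pairs written as (DNA base, RNA base).
WATSON_CRICK = {("A", "U"), ("T", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(dna_arm: str, rna_site: str) -> float:
    """Percentage of positions in `dna_arm` (5'->3') that pair with `rna_site` (5'->3').

    The strands are aligned antiparallel, so the DNA arm is compared with the
    reversed RNA site.
    """
    if not dna_arm or len(dna_arm) != len(rna_site):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = zip(dna_arm.upper(), reversed(rna_site.upper()))
    matches = sum(1 for pair in pairs if pair in WATSON_CRICK)
    return 100.0 * matches / len(dna_arm)


if __name__ == "__main__":
    site = "GUAACUAG"  # illustrative 8-nt RNA site
    print(percent_complementarity("CTAGTTAC", site))  # fully complementary arm
    print(percent_complementarity("CTAGTTGA", site))  # two mismatched positions
```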

Enzymatic DNA molecules of the present invention may alternatively be 
described as having nuclease or ribonuclease activity. These terms may be used 
interchangeably herein. 

The term "enzymatic nucleic acid" as used herein encompasses enzymatic RNA 
or DNA molecules, enzymatic RNA-DNA polymers, and enzymatically active portions or 
derivatives thereof, although enzymatic DNA molecules are a particularly preferred class 
of enzymatically active molecules according to the present invention. 

The term "endodeoxyribonuclease", as used herein, is an enzyme capable of 
cleaving a substrate comprised predominantly of DNA. The term "endoribonuclease", as 
used herein, is an enzyme capable of cleaving a substrate comprised predominantly of 
RNA. 

As used herein, the term "base pair" (bp) is generally used to describe a
partnership of adenine (A) with thymine (T) or uracil (U), or of cytosine (C) with guanine
(G), although it should be appreciated that less-common analogs of the bases A, T, C,
and G (as well as U) may occasionally participate in base pairings. Nucleotides that
normally pair up when DNA or RNA adopts a double-stranded configuration may also be
referred to herein as "complementary bases".

"Complementary nucleotide sequence" generally refers to a sequence of 
nucleotides in a single-stranded molecule or segment of DNA or RNA that is sufficiently 
complementary to that on another single oligonucleotide strand to specifically hybridize
to it with consequent hydrogen bonding. 

"Nucleotide" generally refers to a monomeric unit of DNA or RNA consisting of a 
sugar moiety (pentose), a phosphate group, and a nitrogenous heterocyclic base. The 
base is linked to the sugar moiety via the glycosidic carbon (1' carbon of the pentose)
and that combination of base and sugar is a "nucleoside". When the nucleoside
contains a phosphate group bonded to the 3' or 5' position of the pentose, it is referred
to as a nucleotide. A sequence of operatively linked nucleotides is typically referred to
herein as a "base sequence" or "nucleotide sequence", and their grammatical
equivalents, and is represented herein by a formula whose left to right orientation is in
the conventional direction of 5'-terminus to 3'-terminus, unless otherwise specified.

"Nucleotide analog" generally refers to a purine or pyrimidine nucleotide that 
differs structurally from A, T, G, C, or U, but is sufficiently similar to substitute for the
normal nucleotide in a nucleic acid molecule. As used herein, the term "nucleotide
analog" encompasses altered bases, different or unusual sugars (i.e., sugars other than
the "usual" pentose), or a combination of the two. A listing of exemplary analogs
wherein the base has been altered is provided in section C hereinbelow.

"Oligonucleotide or polynucleotide" generally refers to a polymer of single- or 
double-stranded nucleotides. As used herein, "oligonucleotide" and its grammatical 
equivalents will include the full range of nucleic acids. An oligonucleotide will typically 
refer to a nucleic acid molecule comprised of a linear strand of ribonucleotides. The
exact size will depend on many factors, which in turn depend on the ultimate
conditions of use, as is well known in the art.

As used herein, the term "physiologic conditions" is meant to suggest reaction
conditions emulating those found in mammalian organisms, particularly humans. While
variables such as temperature, availability of cations, and pH ranges may vary as
described in greater detail below, "physiologic conditions" generally comprise a
temperature of about 35-40°C, with 37°C being particularly preferred, as well as a pH
of about 7.0-8.0, with 7.5 being particularly preferred, and further comprise the
availability of cations, preferably divalent and/or monovalent cations, with a
concentration of about 2-15 mM Mg²⁺ and 0-1.0 M Na⁺ being particularly preferred.
"Physiologic conditions", as used herein, may optionally include the presence of free
nucleoside cofactor. As noted previously, preferred conditions are described in greater
detail below.
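
The ranges given in this definition can be captured in a small check, sketched below. The class and function names are hypothetical, and because the text qualifies each range with "about", the hard boundaries used here are a simplifying assumption rather than limits stated by the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReactionConditions:
    temperature_c: float  # degrees Celsius
    ph: float
    mg_mm: float          # Mg2+ concentration, millimolar
    na_m: float           # Na+ concentration, molar


def within_stated_ranges(c: ReactionConditions) -> dict:
    """Report, per parameter, whether a value lies inside the stated range."""
    return {
        "temperature": 35.0 <= c.temperature_c <= 40.0,
        "pH": 7.0 <= c.ph <= 8.0,
        "Mg2+": 2.0 <= c.mg_mm <= 15.0,
        "Na+": 0.0 <= c.na_m <= 1.0,
    }


if __name__ == "__main__":
    preferred = ReactionConditions(temperature_c=37.0, ph=7.5, mg_mm=10.0, na_m=0.15)
    print(within_stated_ranges(preferred))
```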

B. Enzymatic DNA Molecules

In various embodiments, an enzymatic DNA molecule of the present invention
may combine one or more modifications or mutations including additions, deletions, and
substitutions. In alternative embodiments, such mutations or modifications may be
generated using methods which produce random or specific mutations or modifications.
These mutations may, for example, change the length of, or alter the nucleotide
sequence of, a loop, a spacer region or the recognition sequence (or domain). One or
more mutations within one catalytically active enzymatic DNA molecule may be
combined with the mutation(s) within a second catalytically active enzymatic DNA
molecule to produce a new enzymatic DNA molecule containing the mutations of both
molecules.

In other preferred embodiments, an enzymatic DNA molecule of the present
invention may have random mutations introduced into it using a variety of methods well
known to those skilled in the art. For example, the methods described by Cadwell and
Joyce (PCR Methods and Applications 2: 28-33 (1992)) are particularly preferred for use
as disclosed herein, with some modifications, as described in the Examples that follow.
(Also see Cadwell and Joyce, PCR Methods and Applications 3 (Suppl.): S136-S140
(1994).) According to this modified PCR method, random point mutations may be
introduced into cloned genes.

The aforementioned methods have been used, for example, to mutagenize genes
encoding ribozymes with a mutation rate of 0.66% ± 0.13% (95% confidence interval)
per position, as determined by sequence analysis, with no strong preferences observed
with respect to the type of base substitution. This allows the introduction of random
mutations at any position in the enzymatic DNA molecules of the present invention.
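
To give a sense of scale for the per-position mutation rate quoted above, the following sketch computes the expected number of mutations and the chance that a molecule escapes mutation entirely. It assumes that positions mutate independently at the same rate, and the 50-nucleotide length in the usage example is an arbitrary illustration; neither is stated in the text.

```python
def expected_mutations(length: int, rate: float = 0.0066) -> float:
    """Mean number of mutated positions in a sequence of `length` nucleotides."""
    return length * rate


def prob_unmutated(length: int, rate: float = 0.0066) -> float:
    """Probability that no position is mutated, assuming independent positions."""
    return (1.0 - rate) ** length


if __name__ == "__main__":
    n = 50  # illustrative length only
    print(expected_mutations(n), prob_unmutated(n))
```

At this rate most short molecules pass through a single round unchanged, which is consistent with applying mutagenesis over repeated rounds rather than once.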

Another method useful in introducing defined or random mutations is disclosed
in Joyce and Inoue, Nucleic Acids Research 17: 711-722 (1989). This latter method
involves excision of a template (coding) strand of a double-stranded DNA, reconstruction
of the template strand with inclusion of mutagenic oligonucleotides, and subsequent
transcription of the partially-mismatched template. This allows the introduction of
defined or random mutations at any position in the molecule by including
polynucleotides containing known or random nucleotide sequences at selected positions.

Enzymatic DNA molecules of the present invention may be of varying lengths
and folding patterns, as appropriate, depending on the type and function of the
molecule. For example, enzymatic DNA molecules may be about 15 to about 400 or
more nucleotides in length, although a length not exceeding about 250 nucleotides is
preferred, to avoid limiting the therapeutic usefulness of molecules by making them too
large or unwieldy. In various preferred embodiments, an enzymatic DNA molecule of the
present invention is at least about 20 nucleotides in length and, while useful molecules
may exceed 100 nucleotides in length, preferred molecules are generally not more than
about 100 nucleotides in length.

In various therapeutic applications, enzymatic DNA molecules of the present
invention comprise the enzymatically active portions of deoxyribozymes. In various
embodiments, enzymatic DNA molecules of the present invention preferably comprise
not more than about 200 nucleotides. In other embodiments, a deoxyribozyme of the
present invention comprises not more than about 100 nucleotides. In still other
preferred embodiments, deoxyribozymes of the present invention are about 20-75
nucleotides in length, more preferably about 20-65 nucleotides in length. Other
preferred enzymatic DNA molecules are about 10-50 nucleotides in length.

In other applications, enzymatic DNA molecules may assume configurations
similar to those of "hammerhead" ribozymes. Such enzymatic DNA molecules are
preferably no more than about 75-100 nucleotides in length, with a length of about
20-50 nucleotides being particularly preferred.

In general, if one intends to synthesize molecules for use as disclosed herein, the
larger the enzymatic nucleic acid molecule is, the more difficult it is to synthesize.
Those of skill in the art will certainly appreciate these design constraints. Nevertheless,
such larger molecules remain within the scope of the present invention.

It is also to be understood that an enzymatic DNA molecule of the present
invention may comprise enzymatically active portions of a deoxyribozyme or may
comprise a deoxyribozyme with one or more mutations, e.g., with one or more
base-pair-forming sequences or spacers absent or modified, as long as such deletions,
additions or modifications do not adversely impact the molecule's ability to perform as
an enzyme.

The recognition domain of an enzymatic DNA molecule of the present invention 
typically comprises two nucleotide sequences flanking a catalytic domain, and typically 
contains a sequence of at least about 3 to about 30 bases, preferably about 6 to about 
15 bases, which are capable of hybridizing to a complementary sequence of bases 
within the substrate nucleic acid giving the enzymatic DNA molecule its high sequence 
specificity. Modification or mutation of the recognition site via well-known methods 
allows one to alter the sequence specificity of an enzymatic nucleic acid molecule. 
(See, e.g., Joyce et al., Nucleic Acids Research 17: 711-712 (1989).)

Enzymatic nucleic acid molecules of the present invention also include those 
with altered recognition sites or domains. In various embodiments, these altered 
recognition domains confer unique sequence specificities on the enzymatic nucleic acid 
molecule including such recognition domains. The exact bases present in the 
recognition domain determine the base sequence at which cleavage will take place. 
Cleavage of the substrate nucleic acid occurs within the recognition domain. This 
cleavage leaves a 2', 3', or 2',3'-cyclic phosphate group on the substrate cleavage
sequence and a 5' hydroxyl on the nucleotide that was originally immediately 3' of the
substrate cleavage sequence in the original substrate. Cleavage can be redirected to a
site of choice by changing the bases present in the recognition sequence (internal guide
sequence). See Murphy et al., Proc. Natl. Acad. Sci. USA 86: 9218-9222 (1989).
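
The idea of redirecting cleavage by changing the recognition sequence can be sketched as follows: given an RNA target and a chosen cleavage position, the two flanking recognition sequences are taken as DNA reverse complements of the substrate on either side of that position. This is a simplified illustration only. It assumes full Watson-Crick complementarity on both sides, it does not model the catalytic domain or any unpaired nucleotide at the cleavage site, and the function names are hypothetical.

```python
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_complement_dna(rna: str) -> str:
    """DNA sequence (5'->3') complementary to an RNA sequence (5'->3')."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(rna.upper()))


def design_binding_arms(target_rna: str, cleavage_index: int, arm_length: int):
    """Return (5' arm, 3' arm) of a DNA enzyme for a chosen cleavage position.

    `cleavage_index` is the number of substrate nucleotides 5' of the cleavage
    site. The enzyme's 5' arm pairs with substrate 3' of the site, and its 3' arm
    pairs with substrate 5' of the site, because the strands are antiparallel.
    """
    if not 3 <= arm_length <= 30:
        raise ValueError("arm_length outside the 3-30 base range used in this sketch")
    start, end = cleavage_index - arm_length, cleavage_index + arm_length
    if start < 0 or end > len(target_rna):
        raise ValueError("arms would extend past the ends of the target")
    upstream = target_rna[start:cleavage_index]
    downstream = target_rna[cleavage_index:end]
    return reverse_complement_dna(downstream), reverse_complement_dna(upstream)


if __name__ == "__main__":
    substrate = "GGAAAAAGUAACUAGAGAUGGAAG"  # substrate sequence given for Figs. 8 and 9
    print(design_binding_arms(substrate, cleavage_index=14, arm_length=6))
```

Changing `cleavage_index` moves both arms along the target, which is the sense in which the recognition sequence, rather than the catalytic domain, sets the site of cleavage.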

Moreover, it may be useful to add a polyamine to facilitate recognition and 
binding between the enzymatic DNA molecule and its substrate. Examples of useful 
polyamines include spermidine, putrescine or spermine. A spermidine concentration of 
about 1 mM may be effective in particular embodiments, while concentrations ranging 
from about 0.1 mM to about 10 mM may also be useful. 

In various alternative embodiments, an enzymatic DNA molecule of the present 
invention has an enhanced or optimized ability to cleave nucleic acid substrates, 
preferably RNA substrates. As those of skill in the art will appreciate, the rate of an 
enzyme-catalyzed reaction varies depending upon the substrate and enzyme 
concentrations and, in general, levels off at high substrate or enzyme concentrations. 
Taking such effects into account, the kinetics of an enzyme-catalyzed reaction may be 
described in the following terms, which define the reaction.

The enhanced or optimized ability of an enzymatic DNA molecule of the present
invention to cleave an RNA substrate may be determined in a cleavage reaction with
varying amounts of labeled RNA substrate in the presence of enzymatic DNA molecule.
The ability to cleave the substrate is generally defined by the catalytic rate (k_cat) divided
by the Michaelis constant (K_M). The symbol k_cat represents the maximal velocity of an
enzyme reaction when the substrate approaches a saturation value. K_M represents the
substrate concentration at which the reaction rate is one-half maximal.

For example, values for K_M and k_cat may be determined in this invention by
experiments in which the substrate concentration [S] is in excess over enzymatic DNA
molecule concentration [E]. Initial rates of reaction (v_0) over a range of substrate
concentrations are estimated from the initial linear phase, generally the first 5% or less
of the reaction. Data points are fit by a least squares method to a theoretical line given
by the equation: v_0 = -K_M (v_0/[S]) + V_max. Thus, k_cat and K_M are determined by the
initial rate of reaction, v_0, and the substrate concentration [S].
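
The least squares fit described above can be sketched in a few lines. The example regresses the initial rate on the ratio of initial rate to substrate concentration, so that the slope is -K_M and the intercept is V_max. Converting V_max to k_cat by dividing by the total enzyme concentration is the standard relation and is an assumption here, since the text does not spell it out; function names and the synthetic data are illustrative.

```python
def eadie_hofstee_fit(substrate_conc, initial_rates):
    """Fit v0 = -K_M * (v0 / [S]) + V_max by ordinary least squares.

    Returns (K_M, V_max) in the units of the inputs.
    """
    if len(substrate_conc) != len(initial_rates) or len(initial_rates) < 2:
        raise ValueError("need at least two paired ([S], v0) observations")
    xs = [v / s for v, s in zip(initial_rates, substrate_conc)]  # v0 / [S]
    ys = list(initial_rates)                                     # v0
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("v0/[S] values are all identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, intercept


def k_cat_from_v_max(v_max: float, enzyme_conc: float) -> float:
    """k_cat = V_max / [E], assuming all enzyme molecules are active."""
    return v_max / enzyme_conc


if __name__ == "__main__":
    # Synthetic Michaelis-Menten data generated with K_M = 2.0 and V_max = 10.0.
    s = [0.5, 1.0, 2.0, 4.0, 8.0]
    v = [10.0 * x / (2.0 + x) for x in s]
    print(eadie_hofstee_fit(s, v))
```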

In various alternative embodiments, an enzymatic DNA molecule of the present
invention has an enhanced or optimized ability to cleave nucleic acid substrates,
preferably RNA substrates. In preferred embodiments, the enhanced or optimized ability
of an enzymatic DNA molecule to cleave RNA substrates shows about a 10- to 10⁹-fold
improvement over the uncatalyzed rate. In more preferred embodiments, an enzymatic
DNA molecule of the present invention is able to cleave RNA substrates at a rate that is
about 10³- to 10⁷-fold improved over "progenitor" species. In even more preferred
embodiments, the enhanced or optimized ability to cleave RNA substrates is expressed
as a 10⁴- to 10⁶-fold improvement over the progenitor species. One skilled in the art will
appreciate that the enhanced or optimized ability of an enzymatic DNA molecule to
cleave nucleic acid substrates may vary depending upon the selection constraints
applied during the in vitro evolution procedure of the invention.

Various preferred methods of modifying deoxyribozymes and other enzymatic 
DNA molecules and nucleases of the present invention are further described in Examples 
1-3 hereinbelow.

C. Nucleotide Analogs

As noted above, the term "nucleotide analog" as used herein generally refers to
a purine or pyrimidine nucleotide that differs structurally from A, T, G, C, or U, but is
sufficiently similar to substitute for such "normal" nucleotides in a nucleic acid molecule.
As used herein, the term "nucleotide analog" encompasses altered bases, different (or
unusual) sugars, altered phosphate backbones, or any combination of these alterations.
Examples of nucleotide analogs useful according to the present invention include those
listed in the following Table, most of which are found in the approved listing of modified
bases at 37 CFR §1.822 (which is incorporated herein by reference).

Table 1

Nucleotide Analogs

Abbreviation    Description

ac4c            4-acetylcytidine
chm5u           5-(carboxyhydroxylmethyl)uridine
cm              2'-O-methylcytidine
cmnm5s2u        5-carboxymethylaminomethyl-2-thiouridine
d               dihydrouridine
fm              2'-O-methylpseudouridine
galq            β,D-galactosylqueosine
gm              2'-O-methylguanosine
I               inosine
i6a             N6-isopentenyladenosine
m1a             1-methyladenosine
m1f             1-methylpseudouridine
m1g             1-methylguanosine
m1i             1-methylinosine
m22g            2,2-dimethylguanosine
m2a             2-methyladenosine
m2g             2-methylguanosine
m3c             3-methylcytidine
m5c             5-methylcytidine
m6a             N6-methyladenosine
m7g             7-methylguanosine
mam5u           5-methylaminomethyluridine
mam5s2u         5-methoxyaminomethyl-2-thiouridine
manq            β,D-mannosylmethyluridine
mcm5s2u         5-methoxycarbonylmethyluridine
mo5u            5-methoxyuridine
ms2i6a          2-methylthio-N6-isopentenyladenosine
ms2t6a          N-((9-β-D-ribofuranosyl-2-methylthiopurine-6-yl)carbamoyl)threonine
mt6a            N-((9-β-D-ribofuranosylpurine-6-yl)N-methyl-carbamoyl)threonine
mv              uridine-5-oxyacetic acid methylester
o5u             uridine-5-oxyacetic acid (v)
osyw            wybutoxosine
p               pseudouridine
q               queosine
s2c             2-thiocytidine
s2t             5-methyl-2-thiouridine
s2u             2-thiouridine
s4u             4-thiouridine
t               5-methyluridine
t6a             N-((9-β-D-ribofuranosylpurine-6-yl)carbamoyl)threonine
tm              2'-O-methyl-5-methyluridine
um              2'-O-methyluridine
yw              wybutosine
x               3-(3-amino-3-carboxypropyl)uridine, (acp3)u
araU            β,D-arabinosyl
araT            β,D-arabinosyl



Other useful analogs include those described in published international
application no. WO 92/20823 (the disclosures of which are incorporated herein by
reference), or analogs made according to the methods disclosed therein. Analogs
described in DeMesmaeker, et al., Angew. Chem. Int. Ed. Engl. 33: 226-229 (1994);
DeMesmaeker, et al., Synlett: 733-736 (Oct. 1993); Nielsen, et al., Science 254:
1497-1500 (1991); and Idziak, et al., Tetrahedron Letters 34: 5417-5420 (1993) are also
useful according to the within-disclosed invention and said disclosures are incorporated
by reference herein.

D. Methods of Engineering Enzymatic DNA Molecules

The present invention also contemplates methods of producing nucleic acid
molecules having a predetermined activity. In one preferred embodiment, the nucleic
acid molecule is an enzymatic DNA molecule. In another variation, the desired activity is
a catalytic activity.

In one embodiment, the present invention contemplates methods of synthesizing
enzymatic DNA molecules that may then be "engineered" to catalyze a specific or
predetermined reaction. Methods of preparing enzymatic DNA molecules are described
herein; see, e.g., Examples 1-3 hereinbelow. In other embodiments, an enzymatic DNA
molecule of the present invention may be engineered to bind small molecules or ligands,
such as adenosine triphosphate (ATP). (See, e.g., Sassanfar, et al., Nature 364:
550-553 (1993).)

In another embodiment, the present invention contemplates that a population of 
enzymatic DNA molecules may be subjected to mutagenizing conditions to produce a 
diverse population of mutant enzymatic DNA molecules (which may alternatively be 
called "deoxyribozymes" or "DNAzymes"). Thereafter, enzymatic DNA molecules having 
desired characteristics are selected and/or separated from the population and are 
subsequently amplified. 
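
The mutagenize, select, and amplify cycle described above can be outlined as a toy simulation. Everything here is a stand-in: `is_active` is a hypothetical predicate representing whatever physical selection is applied (it is not a method from this disclosure), mutation is modeled as independent base substitution, and amplification is modeled as resampling survivors with replacement.

```python
import random

BASES = "ACGT"


def mutagenize(seq: str, rate: float, rng: random.Random) -> str:
    """Substitute each base with a different base with probability `rate`."""
    out = []
    for base in seq:
        if rng.random() < rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def selection_round(population, is_active, rate, pool_size, rng):
    """One round: mutagenize every molecule, keep the active ones, then amplify."""
    mutants = [mutagenize(seq, rate, rng) for seq in population]
    survivors = [seq for seq in mutants if is_active(seq)]
    if not survivors:
        return []
    # Amplification: resample survivors (with replacement) back up to pool_size.
    return [rng.choice(survivors) for _ in range(pool_size)]


if __name__ == "__main__":
    rng = random.Random(0)
    pool = ["ACGTACGTACGTACGTACGT"] * 100
    # Hypothetical activity criterion, for illustration only.
    next_pool = selection_round(pool, lambda s: s.startswith("ACG"), 0.05, 100, rng)
    print(len(next_pool), len(set(next_pool)))
```

Repeating `selection_round` on its own output mimics successive generations, with diversity introduced by mutation and reduced by selection in each pass.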

Alternatively, mutations may be introduced in the enzymatic DNA molecule by 
altering the length of the recognition domains of the enzymatic DNA molecule. The 
recognition domains of the enzymatic DNA molecule associate with a complementary 
sequence of bases within a substrate nucleic acid sequence. Methods of altering the 
length of the recognition domains are known in the art and include PCR, for example; 
useful techniques are described further in the Examples below. 

Alteration of the length of the recognition domains of an enzymatic DNA 
molecule may have a desirable effect on the binding specificity of the enzymatic DNA 
molecule. For example, an increase in the length of the recognition domains may 
increase binding specificity between the enzymatic DNA molecule and the 
complementary base sequences of an oligonucleotide in a substrate, or may enhance 
recognition of a particular sequence in a hybrid substrate. In addition, an increase in the 
length of the recognition domains may also increase the affinity with which it binds to 
substrate. In various embodiments, these altered recognition domains in the enzymatic 
DNA molecule confer increased binding specificity and affinity between the enzymatic 
DNA molecule and its substrate. 
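
One way to see why longer recognition domains can increase binding specificity is to count how often a recognition site of a given length would be expected to occur by chance. The sketch below assumes a substrate of independent, equiprobable bases, which real nucleic acids only approximate; the function name is illustrative.

```python
def expected_chance_sites(site_length: int, sequence_length: int) -> float:
    """Expected number of exact matches to one fixed site in a random sequence.

    Assumes each of the four bases occurs independently with probability 1/4.
    """
    if site_length < 1:
        raise ValueError("site_length must be at least 1")
    if site_length > sequence_length:
        return 0.0
    windows = sequence_length - site_length + 1
    return windows / 4 ** site_length


if __name__ == "__main__":
    for n in (6, 10, 15):
        print(n, expected_chance_sites(n, 10_000))
```

Each added base divides the expected number of chance matches by roughly four under this model, while also adding base pairs that contribute to affinity.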

It has recently been noted that certain oligonucleotides are able to recognize and
bind molecules other than oligonucleotides with complementary sequences. These
oligonucleotides are often given the name "aptamers". For example, Ellington and
Szostak describe RNA molecules that are able to bind a variety of organic dyes (Nature
346: 818-822 (1990)), while Bock, et al. describe ssDNA molecules that bind human
thrombin (Nature 355: 564-566 (1992)). Similarly, Jellinek, et al. describe RNA ligands
to basic fibroblast growth factor (PNAS USA 90: 11227-11231 (1993)). Thus, it is
further contemplated herein that the catalytically active DNA enzymes of the present
invention may be engineered according to the within-described methods to display a
variety of capabilities typically associated with aptamers.

One of skill in the art should thus appreciate that the enzymatic DNA molecules
of this invention can be altered at any nucleotide sequence, such as the recognition
domains, by various methods disclosed herein, including PCR and 3SR (self-sustained
sequence replication; see Example 1 below). For example, additional nucleotides can
be added to the 5' end of the enzymatic DNA molecule by including additional
nucleotides in the primers.

Enzymatic DNA molecules of the present invention may also be prepared or
engineered in a more non-random fashion via use of methods such as site-directed
mutagenesis. For example, site-directed mutagenesis may be carried out essentially as
described in Morinaga, et al., Biotechnology 2: 636 (1984), modified as described
herein, for application to deoxyribozymes. Useful methods of engineering enzymatic
DNA molecules are further described in the Examples below.

In one disclosed embodiment, an enzymatic DNA molecule of the present
invention comprises a conserved core flanked by two substrate binding (or recognition)
domains or sequences that interact with the substrate through base-pairing interactions.
In various embodiments, the conserved core comprises one or more conserved domains
or sequences. In another variation, an enzymatic DNA molecule further comprises a
"spacer" region (or sequence) between the regions (or sequences) involved in base
pairing. In still another variation, the conserved core is "interrupted" at various intervals
by one or more less-conserved variable or "spacer" nucleotides.

In various embodiments, the population of enzymatic DNA molecules is made up 
of at least 2 different types of deoxyribozyme molecules. For example, in one variation, 
the molecules have differing sequences. In another variation, the deoxyribozymes are 
nucleic acid molecules having a nucleic acid sequence defining a recognition domain that
is contiguous or adjacent to the 5'-terminus of the nucleotide sequence. In various
alternative embodiments, enzymatic DNA molecules of the present invention may further 
comprise one or more spacer regions located 3'-terminal to the recognition domains, one 
or more loops located 3'-terminal to the recognition domains and/or spacer regions. In 
other variations, a deoxyribozyme of the present invention may comprise one or more
regions which are capable of hybridizing to other regions of the same molecule. Other 
characteristics of enzymatic DNA molecules produced according to the presently- 
disclosed methods are described elsewhere herein. 

In other embodiments, mutagenizing conditions include conditions that introduce
either defined or random nucleotide substitutions within an enzymatic DNA molecule.
Examples of typical mutagenizing conditions include conditions disclosed in other parts
of this specification and the methods described by Joyce et al., Nucl. Acids Res. 17:
711-722 (1989); Joyce, Gene 82: 83-87 (1989); and Beaudry and Joyce, Science 257:
635-41 (1992).

In still other embodiments, a diverse population of mutant enzymatic nucleic acid 
molecules of the present invention is one that contains at least 2 nucleic acid molecules 
that do not have the exact same nucleotide sequence. In other variations, from such a 
diverse population, an enzymatic DNA molecule or other enzymatic nucleic acid having a 
predetermined activity is then selected on the basis of its ability to perform the 
predetermined activity. In various embodiments, the predetermined activity comprises, 
without limitation, enhanced catalytic activity, decreased K M , enhanced substrate 
binding ability, altered substrate specificity, and the like. 

Other parameters which may be considered aspects of enzyme performance 
include catalytic activity or capacity, substrate binding ability, enzyme turnover rate, 
enzyme sensitivity to feedback mechanisms, and the like. In certain aspects, substrate 
specificity may be considered an aspect of enzyme performance, particularly in 
situations in which an enzyme is able to recognize and bind two or more competing 
substrates, each of which affects the enzyme's performance with respect to the other 
substrate(s). 

Substrate specificity, as used herein, may refer to the specificity of an enzymatic 
nucleic acid molecule as described herein for a particular substrate, such as one 
comprising ribonucleotides only, deoxyribonucleotides only, or a composite of both. 
Substrate molecules may also contain nucleotide analogs. In various embodiments, an 
enzymatic nucleic acid molecule of the present invention may preferentially bind to a 
particular region of a hybrid or non-hybrid substrate. 

The term or parameter identified herein as "substrate specificity" may also 
include sequence specificity; i.e., an enzymatic nucleic acid molecule of the present 
invention may "recognize" and bind to a nucleic acid substrate having a particular 
nucleic acid sequence. For example, if the substrate recognition domains of an 
enzymatic nucleic acid molecule of the present invention will only bind to substrate 
molecules having a series of one or two ribonucleotides (e.g., rA) in a row, then the 
enzymatic nucleic acid molecule will tend not to recognize or bind nucleic acid substrate 
molecules lacking such a sequence. 

With regard to the selection process, in various embodiments, selecting includes 
any means of physically separating the mutant enzymatic nucleic acids having a 
predetermined activity from the diverse population of mutant enzymatic nucleic acids. 
Often, selecting comprises separation by size, by the presence of a catalytic activity, or 
by hybridizing the mutant nucleic acid to another nucleic acid, to a peptide, or some 
other molecule that is either in solution or attached to a solid matrix.

In various embodiments, the predetermined activity is such that the mutant
enzymatic nucleic acid having the predetermined activity becomes labeled in some
fashion by virtue of the activity. For example, the predetermined activity may be an
enzymatic DNA molecule activity whereby the activity of the mutant enzymatic nucleic
acid upon its substrate causes the mutant enzymatic nucleic acid to become covalently
linked to it. The mutant enzymatic nucleic acid is then selected by virtue of the
covalent linkage.

In other embodiments, selecting a mutant enzymatic nucleic acid having a
predetermined activity includes amplification of the mutant enzymatic nucleic acid (see,
e.g., Joyce, Gene 82: 83-87 (1989); Beaudry and Joyce, Science 257: 635-41 (1992)).
Other methods of selecting an enzymatic nucleic acid molecule having a predetermined
characteristic or activity are described in the Examples section.

E. Compositions

The invention also contemplates compositions containing one or more types or
populations of enzymatic DNA molecules of the present invention; e.g., different types
or populations may recognize and cleave different nucleotide sequences. Compositions
may further include a ribonucleic acid-containing substrate. Compositions according to
the present invention may further comprise lead ion, magnesium ion, or other divalent or
monovalent cations, as discussed herein.

Preferably, the enzymatic DNA molecule is present at a concentration of about 
0.05 µM to about 2 µM. Typically, the enzymatic DNA molecule is present at a 
concentration ratio of enzymatic DNA molecule to substrate of from about 1:5 to about 
1:50. More preferably, the enzymatic DNA molecule is present in the composition at a 
concentration of about 0.1 µM to about 1 µM. Even more preferably, compositions 
contain the enzymatic DNA molecule at a concentration of about 0.1 µM to about 0.5 
µM. Preferably, the substrate is present in the composition at a concentration of about 
0.5 µM to about 1000 µM. 
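
The enzyme-to-substrate ratio stated above can be turned into a quick arithmetic check. The following is only a minimal sketch: the function names and the example enzyme concentration are illustrative assumptions, and the 1:5 and 1:50 bounds are taken as exact although the text qualifies them with "about".

```python
import math


def substrate_window_uM(enzyme_uM, low_fold=5.0, high_fold=50.0):
    """Substrate concentrations (in µM) that give enzyme:substrate ratios
    between 1:low_fold and 1:high_fold for a given enzyme concentration."""
    if enzyme_uM <= 0:
        raise ValueError("enzyme concentration must be positive")
    return enzyme_uM * low_fold, enzyme_uM * high_fold


def substrate_fold_excess(enzyme_uM, substrate_uM):
    """Return x such that the enzyme:substrate ratio is 1:x."""
    if enzyme_uM <= 0 or substrate_uM <= 0:
        raise ValueError("concentrations must be positive")
    return substrate_uM / enzyme_uM


# Illustrative input only: an enzyme concentration of 0.1 µM
low, high = substrate_window_uM(0.1)
print(f"substrate between {low:g} and {high:g} µM keeps the ratio in 1:5 to 1:50")
```

A ratio outside the window does not mean a composition is inoperative; it only falls outside the "typical" range recited above.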

One skilled in the art will understand that there are many sources of nucleic 
acid-containing substrates including naturally-occurring and synthetic sources. Sources 
of suitable substrates include, without limitation, a variety of viral and retroviral agents, 
including HIV-1, HIV-2, HTLV-I, and HTLV-II. 

Other suitable substrates include, without limitation, viral and retroviral agents 
including those comprising or produced by picornaviruses, hepadnaviridae (e.g., HBV, 
HCV), papillomaviruses (e.g., HPV), gammaherpesvirinae (e.g., EBV), 
lymphocryptoviruses, leukemia viruses (e.g., HTLV-I and -II), flaviviruses, togaviruses, 
herpesviruses (including alphaherpesviruses and betaherpesviruses), cytomegaloviruses 
(CMV), influenza viruses, and viruses and retroviruses contributing to immunodeficiency 
diseases and syndromes (e.g., HIV-1 and -2). In addition, suitable substrates include 
viral and retroviral agents which infect non-human primates and other animals including, 
without limitation, the simian and feline immunodeficiency viruses and bovine leukemia 
viruses. 

Magnesium ion, lead ion, or another suitable monovalent or divalent cation, as 
described previously, may also be present in the composition, at a concentration ranging 
from about 1-100 mM. More preferably, the preselected ion is present in the 
composition at a concentration of about 2 mM to about 50 mM, with a concentration of 
about 5 mM being particularly preferred. One skilled in the art will understand that the 
ion concentration is only constrained by the limits of solubility of its source (e.g. 
magnesium) in aqueous solution and a desire to have the enzymatic DNA molecule 
present in the same composition in an active conformation. 

The invention also contemplates compositions containing an enzymatic DNA 
molecule of the present invention, hybrid deoxyribonucleotide-ribonucleotide molecules, 
and magnesium or lead ion in concentrations as described hereinabove. As noted 
previously, other monovalent or divalent ions (e.g., Ca²⁺) may be used in place of 
magnesium. 

Also contemplated by the present invention are compositions containing an 
enzymatic DNA molecule of the present invention, nucleic acid-containing substrate (e.g. 
RNA), and a preselected ion at a concentration of greater than about 1 millimolar, 
wherein said substrate is greater in length than the recognition domains present on the 
enzymatic DNA molecule. 

In one variation, a composition comprises an enzymatic DNA molecule-substrate 
complex, wherein base pairing between an enzymatic DNA molecule and its substrate is 
contiguous. In another embodiment, base pairing between an enzymatic DNA molecule 
and its substrate is interrupted by one or more noncomplementary pairs. In a variety of 
alternative embodiments, a composition of the present invention may further comprise a 
monovalent cation, a divalent cation, or both. 

In another variation, an enzymatic DNA molecule of the present invention is 
capable of functioning efficiently in the presence or absence of a divalent cation. In one 
variation, a divalent cation is present and comprises Pb²⁺, Mg²⁺, Mn²⁺, Zn²⁺, or Ca²⁺. 
Alternatively, an enzymatic DNA molecule of the present invention is capable of 
functioning efficiently in the presence or absence of monovalent cations. It is 
anticipated that monovalent or divalent cation concentrations similar to those described 
herein for Pb²⁺ or Mg²⁺ will be useful as disclosed herein. 

Optionally, monovalent cations may also be present in addition to, or as 
"alternatives" for, divalent cations. For example, monovalent cations such as sodium 
(Na⁺) or potassium (K⁺) may be present, either as dissociated ions or in the form of 
dissociable compounds such as NaCl or KCl. 

In one embodiment, the concentration of monovalent cation present in the 
composition ranges from 0-1.0 M. In another embodiment, a monovalent cation is 
present in a concentration ranging from about 0-200 mM. In other embodiments, 
monovalent cations are present in a concentration ranging from about 1-100 mM. 
Alternatively, the concentration of monovalent cations ranges from about 2 mM - 50 
mM. In still other embodiments, the concentration ranges from about 2 mM - 25 mM. 

F. Methods of Using Enzymatic DNA Molecules 

The methods of using enzymatic DNA molecules as disclosed herein are legion. 
As discussed previously, molecules capable of cleaving the bonds linking neighboring 
nucleic acids (e.g., phosphoester bonds) have numerous uses encompassing a wide 
variety of applications. For example, enzymatic DNA molecules having the 
within-disclosed capabilities, structures, and/or functions are useful in pharmaceutical and 
medical products (e.g., for wound debridement, clot dissolution, etc.), as well as in 
household items (e.g., detergents, dental hygiene products, meat tenderizers). Industrial 
utility of the within-disclosed compounds, compositions and methods is also 
contemplated and well within the scope of the present invention. 

The present invention also describes useful methods for cleaving any 
single-stranded, looped, partially or fully double-stranded nucleic acid; the majority of these 
methods employ the novel enzymatically active nucleic acid molecules of the present 
invention. In various embodiments, the single-stranded nucleic acid segment or portion 
of the substrate (or the entire substrate itself) comprises DNA, modified DNA, RNA, 
modified RNA, or composites thereof. Preferably, the nucleic acid substrate need only 
be single-stranded at or near the substrate cleavage sequence so that an enzymatic 
nucleic acid molecule of the present invention can hybridize to the substrate cleavage 
sequence by virtue of the enzyme's recognition sequence. 

A nucleic acid substrate that can be cleaved by a method of this invention may 
be chemically synthesized or enzymatically produced, or it may be isolated from various 
sources such as phages, viruses, prokaryotic cells, or eukaryotic cells, including animal 
cells, plant cells, yeast cells and bacterial cells. Chemically synthesized single- and 
double-stranded nucleic acids are commercially available from many sources including, 
without limitation, Research Genetics (Huntsville, AL). 

RNA substrates may also be synthesized using an Applied Biosystems (Foster 
City, CA) oligonucleotide synthesizer according to the manufacturer's instructions. 
Single-stranded phage are also a source of nucleic acid substrates. (See, e.g., Messing 
et al., PNAS USA 74 : 3642-3646 (1977), and Yanisch-Perron et al., Gene 33 : 103-119 
(1985).) Bacterial cells containing single-stranded phage would also be a ready source 
of suitable single-stranded nucleic acid substrates. 

Single-stranded RNA cleavable by a method of the present invention could be 
provided by any of the RNA viruses such as the picornaviruses, togaviruses, 
orthomyxoviruses, paramyxoviruses, rhabdoviruses, coronaviruses, arenaviruses or 
retroviruses. As noted previously, a wide variety of prokaryotic and eukaryotic cells 
may also be excellent sources of suitable nucleic acid substrates. 

The methods of this invention may be used on single-stranded nucleic acids or 
single-stranded portions of looped or double-stranded nucleic acids that are present 
inside a cell, including eukaryotic, prokaryotic, plant, animal, yeast or bacterial cells. 
Under these conditions an enzymatic nucleic acid molecule (e.g., an enzymatic DNA 
molecule or deoxyribozyme) of the present invention could act as an anti-viral agent or a 
regulator of gene expression. Examples of such uses of enzymatic DNA molecules of 
the present invention are described further hereinbelow. 

In the majority of methods of the present invention, cleavage of single-stranded 
nucleic acids occurs at the 3'-terminus of a predetermined base sequence. This 
predetermined base sequence or substrate cleavage sequence typically contains from 1 
to about 10 nucleotides. In other preferred embodiments, an enzymatic DNA molecule 
of the present invention is able to recognize nucleotides either upstream, or upstream 
and downstream of the cleavage site. In various embodiments, an enzymatic DNA 
molecule is able to recognize about 2-10 nucleotides upstream of the cleavage site; in 
other embodiments, an enzymatic DNA molecule is able to recognize about 2-10 
nucleotides upstream and about 2-10 nucleotides downstream of the cleavage site. 
Other preferred embodiments contemplate an enzymatic DNA molecule that is capable 
of recognizing a nucleotide sequence up to about 30 nucleotides in length, with a length 
up to about 20 nucleotides being even more preferred. 

The within-disclosed methods allow cleavage at any nucleotide sequence by 
altering the nucleotide sequence of the recognition domains of the enzymatic DNA 
molecule. This allows cleavage of single-stranded nucleic acid in the absence of a 
restriction endonuclease site at the selected position. 
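
The relationship between a chosen cleavage site and the recognition domains can be sketched in code. The following is a minimal illustration only, assuming simple Watson-Crick pairing between each recognition domain and the substrate nucleotides flanking the cleavage site. The function names, the example substrate, and the arm lengths are hypothetical and are not taken from this disclosure, and the catalytic domain lying between the two recognition domains is not modeled.

```python
# Base that pairs with each substrate nucleotide, written as DNA.
# Both RNA (U) and DNA (T) substrate bases are accepted.
PAIRING_DNA_BASE = {"A": "T", "U": "A", "T": "A", "G": "C", "C": "G"}


def pairing_dna(segment):
    """Return the DNA strand (written 5'->3') that base-pairs with
    `segment` (also written 5'->3'), i.e. its reverse complement."""
    return "".join(PAIRING_DNA_BASE[base] for base in reversed(segment.upper()))


def recognition_arms(substrate, cleavage_index, upstream=8, downstream=8):
    """Propose recognition-domain sequences for a cleavage site.

    `cleavage_index` is the number of substrate nucleotides lying 5' of the
    bond to be cleaved. `upstream` and `downstream` are the numbers of
    flanking nucleotides to recognize (the text mentions about 2-10 each).
    """
    if not 0 < cleavage_index < len(substrate):
        raise ValueError("cleavage site must lie inside the substrate")
    start = max(0, cleavage_index - upstream)
    upstream_target = substrate[start:cleavage_index]
    downstream_target = substrate[cleavage_index:cleavage_index + downstream]
    return {
        "pairs_upstream": pairing_dna(upstream_target),
        "pairs_downstream": pairing_dna(downstream_target),
    }


# Hypothetical 10-nucleotide RNA substrate, cleaved after its fifth nucleotide
arms = recognition_arms("GGAUCCAUGC", cleavage_index=5, upstream=3, downstream=3)
print(arms)
```

Retargeting the enzyme to a different site in this sketch amounts to calling the function with a different `cleavage_index`, which mirrors the point that no restriction endonuclease site is needed at the selected position.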

An enzymatic DNA molecule of the present invention may be separated from any 
portion of the single-stranded nucleic acid substrate that remains attached to the 
enzymatic DNA molecule by site-specific hydrolysis at the appropriate cleavage site. 
Separation of the enzymatic DNA molecule from the substrate (or "cleavage product") 
allows the enzymatic DNA molecule to carry out another cleavage reaction. 

Generally, the nucleic acid substrate is treated under appropriate nucleic acid 
cleaving conditions (preferably, physiologic conditions) with an effective amount of 
an enzymatic DNA molecule of the present invention. If the nucleic acid substrate 
comprises DNA, cleaving conditions may include the presence of a divalent cation at a 
concentration of about 2-10 mM. 

An effective amount of an enzymatic DNA molecule is the amount required to 
cleave a predetermined base sequence present within the single-stranded nucleic acid. 
Preferably, the enzymatic DNA molecule is present at a molar ratio of DNA molecule to 
substrate cleavage sites of 1 to 20. This ratio may vary depending on the length of 
treating and efficiency of the particular enzymatic DNA molecule under the particular 
nucleic acid cleavage conditions employed. 

Thus, in one preferred embodiment, treating typically involves admixing, in 
aqueous solution, the RNA-containing substrate and the enzyme to form a cleavage 
admixture, and then maintaining the admixture thus formed under RNA cleaving 
conditions for a time period sufficient for the enzymatic DNA molecule to cleave the 
RNA substrate at any of the predetermined nucleotide sequences present in the RNA. In 
various embodiments, a source of ions is also provided - i.e. monovalent or divalent 
cations, or both. 

In one embodiment of the present invention, the amount of time necessary for 
the enzymatic DNA molecule to cleave the single-stranded nucleic acid has been 
predetermined. The amount of time is from about 1 minute to about 24 hours and will 
vary depending upon the concentration of the reactants and the temperature of the 
reaction. Usually, this time period is from about 10 minutes to about 2 hours such that 
the enzymatic DNA molecule cleaves the single-stranded nucleic acid at any of the 
predetermined nucleotide sequences present. 

The invention further contemplates that the nucleic acid cleaving conditions 
include the presence of a source of divalent cations (e.g., PbOAc) at a concentration of 
about 2-100 mM. Typically, the nucleic acid cleaving conditions include divalent cation 
at a concentration of about 2 mM to about 10 mM, with a concentration of about 5 mM 
being particularly preferred. 

The optimal cationic concentration to include in the nucleic acid cleaving 
conditions can be easily determined by determining the amount of single-stranded 
nucleic acid cleaved at a given cation concentration. One skilled in the art will 
understand that the optimal concentration may vary depending on the particular 
enzymatic DNA molecule employed. 
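
The titration just described reduces to picking the cation concentration at which the largest fraction of substrate is cleaved. A minimal sketch follows; the function name is an assumption and the titration values are invented purely for illustration and do not represent measured data.

```python
def best_cation_concentration(measurements):
    """Pick the cation concentration giving the most cleavage.

    `measurements` maps cation concentration (mM) to the fraction of
    single-stranded nucleic acid cleaved (0 to 1) at that concentration.
    """
    if not measurements:
        raise ValueError("at least one measurement is required")
    for concentration, fraction in measurements.items():
        if concentration < 0 or not 0.0 <= fraction <= 1.0:
            raise ValueError("invalid measurement")
    return max(measurements, key=measurements.get)


# Hypothetical titration data (mM -> fraction cleaved), for illustration only
titration = {2: 0.31, 5: 0.62, 10: 0.58, 50: 0.40, 100: 0.22}
print(best_cation_concentration(titration), "mM")
```

Because the optimum may differ from one enzymatic DNA molecule to another, such a titration would be repeated for each molecule of interest.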

The present invention further contemplates that the nucleic acid cleaving 
conditions include a pH of about pH 6.0 to about pH 9.0. In one preferred embodiment, 
the pH ranges from about pH 6.5 to pH 8.0. In another preferred embodiment, the pH 
emulates physiological conditions, i.e., the pH is about 7.0-7.8, with a pH of about 7.5 
being particularly preferred. 

One skilled in the art will appreciate that the methods of the present invention 
will work over a wide pH range so long as the pH used for nucleic acid cleaving is such 
that the enzymatic DNA molecule is able to remain in an active conformation. An 
enzymatic DNA molecule in an active conformation is easily detected by its ability to 
cleave single-stranded nucleic acid at a predetermined nucleotide sequence. 

In various embodiments, the nucleic acid cleaving conditions also include a 
variety of temperature ranges. As noted previously, temperature ranges consistent with 
physiological conditions are especially preferred, although temperature ranges consistent 
with industrial applications are also contemplated herein. In one embodiment, the 
temperature ranges from about 15°C to about 60°C. In another variation, the nucleic 
acid cleaving conditions include a temperature ranging from about 30°C to about 56°C. 
In yet another variation, nucleic acid cleavage conditions include a temperature from 
about 35°C to about 50°C. In a preferred embodiment, nucleic acid cleavage conditions 
comprise a temperature range of about 37°C to about 42°C. The temperature ranges 
consistent with nucleic acid cleaving conditions are constrained only by the desired 
cleavage rate and the stability of that particular enzymatic DNA molecule at that 
particular temperature. 

In various methods, the present invention contemplates nucleic acid cleaving 
conditions including the presence of a polyamine. Polyamines useful for practicing the 
present invention include spermidine, putrescine, spermine and the like. In one 
variation, the polyamine is present at a concentration of about 0.01 mM to about 10 mM. 
In another variation, the polyamine is present at a concentration of about 1 mM to about 
10 mM. Nucleic acid cleavage conditions may also include the presence of polyamine at 
a concentration of about 2 mM to about 5 mM. In various preferred embodiments, the 
polyamine is spermidine. 
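
The broadest pH, temperature, divalent cation, and polyamine ranges recited above can be collected into a simple check of proposed cleaving conditions. This is only a sketch: the class and function names are hypothetical, the bounds are treated as sharp limits although the text qualifies each with "about", and the polyamine is treated as optional.

```python
from dataclasses import dataclass
from typing import Optional

# Broadest ranges recited in the text, as (low, high)
BROAD_RANGES = {
    "ph": (6.0, 9.0),
    "temperature_c": (15.0, 60.0),
    "divalent_mM": (2.0, 100.0),
    "polyamine_mM": (0.01, 10.0),
}


@dataclass
class CleavingConditions:
    ph: float
    temperature_c: float
    divalent_mM: float
    polyamine_mM: Optional[float] = None  # None means no polyamine is added


def out_of_range(conditions):
    """Return the names of parameters lying outside the broadest ranges."""
    problems = []
    for name, (low, high) in BROAD_RANGES.items():
        value = getattr(conditions, name)
        if value is None:
            continue  # optional component not present
        if not low <= value <= high:
            problems.append(name)
    return problems


# Conditions near the particularly preferred values: pH 7.5, 37 °C, 5 mM cation
print(out_of_range(CleavingConditions(ph=7.5, temperature_c=37.0, divalent_mM=5.0)))
```

An empty list indicates only that the values fall inside the recited ranges; whether a given enzymatic DNA molecule is active under them still has to be established by assay.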

G. Vectors 

The present invention also features expression vectors including a nucleic acid 
segment encoding an enzymatic DNA molecule of the present invention situated within 
the vector, preferably in a manner which allows expression of that enzymatic DNA 
molecule within a target cell (e.g., a plant or animal cell). 

Thus, in general, a vector according to the present invention preferably includes 
a plasmid, cosmid, phagemid, virus, or phage vector. Preferably, suitable vectors 
comprise single-stranded DNA (ssDNA) - e.g., circular phagemid ssDNA. It should also 
be appreciated that useful vectors according to the present invention need not be 
circular. 

In one variation, nucleotide sequences flanking each of the additional enzymatic 
DNA molecule-encoding sequences are preferably provided, which sequences may be 
recognized by the first enzymatic DNA molecule. The intervening or flanking sequences 
preferably comprise at least 1 nucleotide; more preferably, intervening or flanking 
sequences are about 2-20 nucleotides in length, with sequences of about 5-10 
nucleotides in length being particularly preferred. 

The addition of polynucleotide tails may also be useful to protect the 3' end of 
an enzymatic DNA molecule according to the present invention. These may be provided 
by attaching a polymeric sequence by employing the enzyme terminal transferase. 

A vector according to the present invention includes two or more enzymatic 
DNA molecules. In one embodiment, a first enzymatic DNA molecule has intramolecular 
cleaving activity and is able to recognize and cleave nucleotide sequences to release 
other enzymatic DNA sequences; i.e., it is able to function to "release" other enzymatic 
DNA molecules from the vector. For example, a vector is preferably constructed so that 
when the first enzymatic DNA molecule is expressed, that first molecule is able to 
cleave nucleotide sequences flanking additional nucleotide sequences encoding a second 
enzymatic DNA molecule, a third enzymatic DNA molecule, and so forth. Presuming 
said first enzymatic DNA molecule (i.e., the "releasing" molecule) is able to cleave 
oligonucleotide sequences intramolecularly, the additional (e.g. second, third, and so on) 
enzymatic DNA molecules (i.e., the "released" molecules) need not possess 
characteristics identical to the "releasing" molecule. For example, in one embodiment, 
the "released" (i.e., the second, third, etc.) enzymatic DNA molecules are able to cleave 
specific RNA sequences, while the first ("releasing") enzymatic DNA molecule has 
nuclease activity allowing it to liberate the "released" molecules. In another 
embodiment, the "released" enzymatic DNA molecule has amide bond-cleaving activity, 
while the first ("releasing") enzymatic DNA molecule has nuclease activity. 

Alternatively, the first enzymatic DNA molecule may be encoded on a separate 
vector from the second (and third, fourth, etc.) enzymatic DNA molecule(s) and may 
have intermolecular cleaving activity. As noted herein, the first enzymatic DNA 
molecule can be a self-cleaving enzymatic DNA molecule (e.g., a deoxyribozyme), and 
the second enzymatic DNA molecule may be any desired type of enzymatic DNA 
molecule. When a vector is caused to express DNA from these nucleic acid sequences, 
that DNA has the ability under appropriate conditions to cleave each of the flanking 
regions, thereby releasing one or more copies of the second enzymatic DNA molecule. 
If desired, several different second enzymatic DNA molecules can be placed in the same 
cell or carrier to produce different deoxyribozymes. It is also contemplated that any one 
or more vectors may comprise one or more ribozymes or deoxyribozymes in any 
combination of "releasing" and "released" enzymatic nucleic acid molecules, as long as 
such a combination achieves the desired result: the release of enzymatic nucleic acid 
molecules that are capable of cleaving predetermined nucleic acid sequences. 

Methods of isolating and purifying enzymatic DNA molecules of the present 
invention are also contemplated. In addition to the methods described herein, various 
purification methods (e.g. those using HPLC) and chromatographic isolation techniques 
are available in the art. See, e.g., the methods described in published international 
application no. WO 93/23569, the disclosures of which are incorporated herein by 
reference. 

It should also be understood that various combinations of the embodiments 
described herein are included within the scope of the present invention. Other features 
and advantages of the present invention will be apparent from the descriptions 
hereinabove, from the Examples to follow, and from the claims. 



EXAMPLES 

The following examples illustrate, but do not limit, the present invention. 

Example 1 

In Vitro Evolution of Enzymatic DNA Molecules: 

An Overview 

In vitro selection and in vitro evolution techniques allow new catalysts to be 
isolated without a priori knowledge of their composition or structure. Such methods 
have been used to obtain RNA enzymes with novel catalytic properties. For example, 
ribozymes that undergo autolytic cleavage with lead cation have been derived from a 
randomized pool of tRNAPhe molecules (Pan and Uhlenbeck, Biochemistry 31 : 3887-3895 
(1992)). Group I ribozyme variants have been isolated that can cleave DNA (Beaudry 
and Joyce, Science 257 : 635-641 (1992)) or that have altered metal dependence 
(Lehman and Joyce, Nature 361 : 182-185 (1993)). Starting with a pool of random RNA 
sequences, molecules have been obtained that catalyze a polymerase-like reaction 
(Bartel and Szostak, Science 261 : 1411-1418 (1993)). In the present example, 
refinement of specific catalytic properties of an evolved enzyme via alteration of the 
selection constraints during an in vitro evolution procedure is described. 

Darwinian evolution requires the repeated operation of three processes: (a) 
introduction of genetic variation; (b) selection of individuals on the basis of some fitness 
criterion; and (c) amplification of the selected individuals. Each of these processes can 
be realized in vitro (Joyce, Gene 82 : 83 (1989)). A gene can be mutagenized by 
chemical modification, incorporation of randomized mutagenic oligodeoxynucleotides, or 
inaccurate copying by a polymerase. (See, e.g., Cadwell and Joyce, in PCR Methods 
and Applications 2 : 28-33 (1992); Cadwell and Joyce, PCR Methods and Applications 3 
(Suppl.) : S136-S140 (1994); Chu, et al., Virology 98 : 168 (1979); Shortle, et al., Meth. 
Enzymol. 100 : 457 (1983); Myers, et al., Science 229 : 242 (1985); Matteucci, et al., 
Nucl. Acids Res. 11 : 3113 (1983); Wells, et al., Gene 34 : 315 (1985); McNeil, et al., 
Mol. Cell. Biol. 5 : 3545 (1985); Hutchison, et al., PNAS USA 83 : 710 (1986); 
Derbyshire, et al., Gene 46 : 145 (1986); Zakour, et al., Nature 295 : 708 (1982); 
Lehtovaara, et al., Protein Eng. 2 : 63 (1988); Leung, et al., Technique 1 : 11 (1989); 
Zhou, et al., Nucl. Acids Res. 19 : 6052 (1991).) 
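
The three repeated processes named above (variation, selection, amplification) can be illustrated with a toy simulation. The sketch below is not the laboratory procedure: sequences are plain strings, the mutation rate and population size are arbitrary, and the fitness function (a count of G residues) is a hypothetical stand-in for a measured catalytic activity.

```python
import random

ALPHABET = "ACGT"


def mutate(seq, rate, rng):
    """(a) Variation: with probability `rate`, each position is redrawn at
    random from the four bases (the redraw may return the same base)."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else base for base in seq
    )


def select(population, fitness, keep):
    """(b) Selection: retain the `keep` individuals of highest fitness."""
    return sorted(population, key=fitness, reverse=True)[:keep]


def amplify(selected, size, rng):
    """(c) Amplification: copy selected individuals at random until the
    population is restored to `size` members."""
    return [rng.choice(selected) for _ in range(size)]


def evolve(population, fitness, rounds, rate=0.05, keep_fraction=0.2, seed=0):
    """Run `rounds` cycles of mutate -> select -> amplify."""
    rng = random.Random(seed)
    size = len(population)
    keep = max(1, int(size * keep_fraction))
    for _ in range(rounds):
        variants = [mutate(seq, rate, rng) for seq in population]
        survivors = select(variants, fitness, keep)
        population = amplify(survivors, size, rng)
    return population


def g_count(seq):
    """Toy fitness criterion (hypothetical): number of G residues."""
    return seq.count("G")


final = evolve(["ACGT" * 5] * 50, g_count, rounds=10)
print(max(g_count(seq) for seq in final))
```

For nucleic acid enzymes the same molecule is both mutated and selected, which is why a single string can stand for genotype and phenotype in this sketch.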

The gene product can be selected, for example, by its ability to bind a ligand or 
to carry out a chemical reaction. (See, e.g., Joyce, Id. (1989); Robertson and Joyce, 
Nature 344 : 467 (1990); Tuerk, et al., Science 249 : 505 (1990).) The gene that 
corresponds to the selected gene product can be amplified by a reciprocal primer 
method, such as the polymerase chain reaction (PCR). (See, e.g., Saiki, et al., Science 
230 : 1350-54 (1985); Saiki, et al., Science 239 : 487-491 (1988).) 

Alternatively, nucleic acid amplification may be carried out using self-sustained 
sequence replication (3SR). (See, e.g., Guatelli, et al., PNAS USA 87 : 1874 (1990), the 
disclosures of which are incorporated by reference herein.) According to the 3SR 
method, target nucleic acid sequences may be amplified (replicated) exponentially in 
vitro under isothermal conditions by using three enzymatic activities essential to 
retroviral replication: (1) reverse transcriptase, (2) RNase H, and (3) a DNA-dependent 
RNA polymerase. By mimicking the retroviral strategy of RNA replication by means of 
cDNA intermediates, this reaction accumulates cDNA and RNA copies of the original 
target. 

In summary, if one is contemplating the evolution of a population of enzymatic 
DNA molecules, a continuous series of reverse transcription and transcription reactions 
replicates an RNA target sequence by means of cDNA intermediates. The crucial 
elements of this design are (a) the oligonucleotide primers both specify the target and 
contain 5' extensions encoding the T7 RNA polymerase binding site, so that the 
resultant cDNAs are competent transcription templates; (b) cDNA synthesis can proceed 
to completion of both strands due to the degradation of template RNA in the 
intermediate RNA-DNA hybrid by RNase H; and (c) the reaction products (cDNA and 
RNA) can function as templates for subsequent steps, enabling exponential replication. 

If one is evolving enzymatic DNA molecules, various critical elements of this 
design are somewhat different, as disclosed in these Examples. For instance, (1) the 
oligonucleotide primers specify the target and are preferably "marked" or labeled in 
some fashion (e.g., via biotinylation) so the resultant competent template strands are 
easily identified; and (2) the in vitro selection procedure used preferably depends upon 
the identification of the most favorable release mechanism. 
A major obstacle to realizing Darwinian evolution in vitro is the need to integrate 
mutation and amplification, both of which are genotype-related, with selection, which is 
phenotype-related. In the case of nucleic acid enzymes, for which genotype and 
phenotype are embodied in the same molecule, the task is simplified. 

A. Design of Enzymatic DNA Molecules 

It is well known that single-stranded DNA can assume interesting tertiary 
structures. The structure of a "tDNA", for example, closely resembles that of the 
corresponding tRNA. (See Paquette, et al., Eur. J. Biochem. 189 : 259-265 (1990).) 
Furthermore, it has been possible to replace as many as 31 of 35 ribonucleotides within 
a hammerhead ribozyme with deoxyribonucleotides, while retaining at least some 
catalytic activity. (See Perreault, et al., Nature 344 : 565-567 (1990); Williams, et al., 
Proc. Natl. Acad. Sci. USA 89 : 918-921 (1992); Yang, et al., Biochemistry 31 : 
5005-5009 (1992).) 

In vitro selection techniques have been applied to large populations of 
random-sequence DNAs, leading to the recovery of specific DNA "aptamers" that bind a 
target ligand with high affinity (Bock, et al., Nature 355 : 564-566 (1992); Ellington & 
Szostak, Nature 355 : 850-852 (1992); Wyatt & Ecker, PNAS USA 91 : 1356-1360 
(1994)). Recently, two groups carried out the first NMR structural determination of an 
aptamer, a 15mer DNA that forms a G-quartet structure and binds the protein thrombin 
with high affinity (Wang, et al., Biochemistry 32 : 1899-1904 (1993); Macaya, et al., 
PNAS USA 90 : 3745-3749 (1993)). These findings were corroborated by an X-ray 
crystallographic analysis (Padmanabhan, et al., J. Biol. Chem. 268 : 17651-17654 
(1993)). 

The ability to bind a substrate molecule with high affinity and specificity is a 
prerequisite of a good enzyme. In addition, an enzyme must make use of 
well-positioned functional groups, either within itself or a cofactor, to promote a 
particular chemical transformation. Furthermore, the enzyme must remain unchanged 
over the course of the reaction and be capable of operating with catalytic turnover. 
Some would add the requirement that it be an informational macromolecule, comprised 
of subunits whose specific ordering is responsible for catalytic activity. While these 
criteria are open to debate on both semantic and chemical grounds, they serve to 
distinguish phenomena of chemical rate enhancement that range from simple solvent 
effects to biological enzymes operating at the limit of substrate diffusion (Albery & 
Knowles, Biochemistry 15 : 5631-5640 (1976)). 

As described in greater detail hereinbelow, we sought to develop a general 
method for rapidly obtaining DNA catalysts and DNA enzymes, starting from random 
sequences. As an initial target, we chose a reaction that we felt was well within the 
capability of DNA: the hydrolytic cleavage of an RNA phosphodiester, assisted by a 
divalent metal cofactor. This is the same reaction that is carried out by a variety of 
naturally-occurring RNA enzymes, including the hammerhead and hairpin motifs. (See, 
e.g., Forster A.C. & Symons R.H., Cell 49 : 211-220 (1987); Uhlenbeck, Nature 328 : 
596-600 (1987); Hampel & Tritz, Biochemistry 28 : 4929-4933 (1989).) 

It has recently been shown that, beginning with a randomized library of tRNA 
molecules, one can obtain ribozymes that have Pb²⁺-dependent, site-specific RNA 
phosphoesterase activity at neutral pH (Pan & Uhlenbeck, Biochemistry 31 : 3887-3895 
(1992); Pan & Uhlenbeck, Nature 358 : 560-563 (1992)). This is analogous to the 
fortuitous self-cleavage reaction of yeast tRNAPhe (Dirheimer & Werner, Biochimie 54 : 
127-144 (1972)), which depends on specific coordination of a Pb²⁺ ion at a defined site 
within the tRNA. (See Rubin & Sundaralingam, J. Biomol. Struct. Dyn. 1 : 639-646 
(1983); Brown, et al., Biochemistry 24 : 4785-4801 (1985).) 

As disclosed herein, our goals included the development of DNAs that could 
carry out Pb²⁺-dependent cleavage of a particular RNA phosphoester, initially presented 
within a short leader sequence attached to the 5' end of the DNA, and ultimately 
located within a separate molecule that could be cleaved in an intermolecular fashion 
with rapid catalytic turnover. These goals were successfully achieved, as described 
further below. 

No assumptions were made as to how the DNA would interact with the target 
phosphoester and surrounding nucleotides. Beginning with a pool of approximately 10¹⁴ 
random 50mer sequences, in vitro selection was allowed to run its course. After five 
rounds of selection carried out over four days, the population as a whole had attained 
the ability to cleave the target phosphoester in the presence of 1 mM Pb²⁺ at a rate of 
about 0.2 min⁻¹. This is an approximately 10⁵-fold increase compared to the 
spontaneous rate of cleavage under the same reaction conditions. 
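
To put the reported rate in perspective, the arithmetic can be sketched as follows. The sketch assumes simple first-order (single-exponential) cleavage kinetics, which is an assumption made here for illustration, and it treats the rate of 0.2 per minute and the fold enhancement of 10^5 as inputs. The function names are hypothetical.

```python
import math


def fraction_cleaved(k_per_min, minutes):
    """Fraction of molecules cleaved after `minutes`, assuming first-order
    kinetics with rate constant `k_per_min`."""
    return 1.0 - math.exp(-k_per_min * minutes)


def half_life_min(k_per_min):
    """Time (minutes) for half of the molecules to be cleaved."""
    return math.log(2) / k_per_min


def background_rate(k_observed, fold_enhancement):
    """Spontaneous rate implied by an observed rate and a fold enhancement."""
    return k_observed / fold_enhancement


k_obs = 0.2    # per minute, population rate after five rounds
fold = 1e5     # fold increase over the spontaneous reaction

print(f"half-life with catalysis: {half_life_min(k_obs):.1f} min")
print(f"implied spontaneous rate: {background_rate(k_obs, fold):.1e} per min")
print(f"fraction cleaved in 10 min: {fraction_cleaved(k_obs, 10):.2f}")
```

Under these assumptions the implied spontaneous half-life is on the order of months, compared with a few minutes for the selected population.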

Individuals were isolated from the population, sequenced, and assayed for 
catalytic activity. Based on this information, the reaction was converted to an 
intermolecular format and then simplified to allow site-specific cleavage of a 19mer 
substrate by a 38mer DNA enzyme, in a reaction that proceeds with a turnover rate of 1 
min⁻¹ at 23°C and pH 7.0 in the presence of 1 mM PbOAc. 

B. In Vitro Selection Scheme 

A starting pool of approximately 10¹⁴ single-stranded DNA molecules was 
generated, all of which contain a 5' biotin moiety, followed successively by a fixed 
domain that includes a single ribonucleotide, a potential catalytic domain comprised of 
50 random deoxyribonucleotides, and a second fixed domain that lay at the 3' terminus 
(Fig. 1). 
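
The layout of a pool member, and the size of the sequence space from which the pool is drawn, can be sketched as below. This is an illustration only: the fixed-domain sequences are hypothetical placeholders rather than the sequences actually used, the class and function names are assumptions, and the pool size passed in the usage lines is likewise an assumed input.

```python
import random
from dataclasses import dataclass

RANDOM_LENGTH = 50  # random deoxyribonucleotides in the potential catalytic domain


@dataclass(frozen=True)
class PoolMember:
    five_prime_label: str  # 5' biotin moiety
    fixed_5p: str          # DNA part of the first fixed domain (placeholder)
    ribonucleotide: str    # the single embedded ribonucleotide
    random_domain: str     # potential catalytic domain
    fixed_3p: str          # second fixed domain at the 3' terminus (placeholder)


def random_domain(length, rng):
    """Draw `length` deoxyribonucleotides uniformly at random."""
    return "".join(rng.choice("ACGT") for _ in range(length))


def make_member(rng, fixed_5p="ACGTACGT", fixed_3p="TGCATGCA"):
    """Assemble one pool member; both fixed domains are hypothetical."""
    return PoolMember("biotin", fixed_5p, "rA",
                      random_domain(RANDOM_LENGTH, rng), fixed_3p)


def sequence_space(length):
    """Number of distinct sequences of `length` over four bases."""
    return 4 ** length


member = make_member(random.Random(0))
print(member.random_domain)

assumed_pool_size = 1e14  # assumed input for illustration
print(f"fraction of 50mer space sampled: {assumed_pool_size / sequence_space(50):.1e}")
```

Even a very large pool samples only a vanishing fraction of all possible 50mers, which is consistent with making no assumptions in advance about which sequences will be active.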

The pool was constructed by a nested PCR (polymerase chain reaction) 
technique, beginning with synthetic DNA that contained 50 random nucleotides flanked 
by primer binding sites. The nested PCR primer was a 5'-biotinylated synthetic 
oligodeoxynucleotide with a 3'-terminal adenosine ribonucleotide. 

Ribonucleotide-terminated oligonucleotides efficiently prime template-directed elongation 
in the context of the PCR (L.E. Orgel, personal communication), in this case giving rise 
to an extension product that contains a single embedded ribonucleotide. 

Figure 1 illustrates a selective amplification scheme for isolation of DNAs that 
cleave a target RNA phosphoester. Double-stranded DNA containing a stretch of 50 
random nucleotides is amplified via PCR, employing a 5'-biotinylated DNA primer (e.g., 
primer 3, either 3a or 3b) terminated at the 3' end by an adenosine ribonucleotide 
(represented by the symbol "N" or "rA", wherein both N and rA represent an adenosine 
ribonucleotide). This primer is extended by Taq polymerase to yield a DNA product that 
contains a single embedded ribonucleotide. The resulting double-stranded DNA is 
immobilized on a streptavidin matrix and the unbiotinylated DNA strand is removed by 
washing with 0.2 N NaOH. After re-equilibrating the column with a buffered solution, 
the column is washed with the same solution with added 1 mM PbOAc. DNAs that 
undergo Pb 2 + -dependent self-cleavage are released from the column, collected in the 
eluant, and amplified by PCR. The PCR products are then used to initiate the next round 
of selective amplification. 

The PCR products were passed over a streptavidin affinity matrix, resulting in 
noncovalent attachment of the 5'-biotinylated strand of the duplex DNA. The
nonbiotinylated strand was removed by brief washing with 0.2 N NaOH, and the bound
strand was equilibrated in a buffer containing 0.5 M NaCl, 0.5 M KCl, 50 mM MgCl₂,
and 50 mM HEPES (pH 7.0) at 23°C. Next, 1 mM PbOAc was provided in the same
buffer, allowing Pb²⁺-dependent cleavage to occur at the target phosphoester, thereby
releasing a subset of the DNAs from the streptavidin matrix. In principle, an individual
DNA might facilitate its own release by various means, such as disruption of the
interaction between biotin and streptavidin or cleavage of one of the
deoxyribonucleotide linkages. It was felt that cleavage of the ribonucleoside 3'-O-P
bond would be the most likely mechanism for release, based on the relative lability of
this linkage, and that Pb²⁺-dependent hydrolytic cleavage would allow release to occur
most rapidly. In principle, however, the in vitro selection procedure should identify the 
most favorable release mechanism as well as those individuals best able to carry out 
that mechanism. 

DNA molecules released from the matrix upon addition of Pb²⁺ were collected in
the eluant, concentrated by precipitation with ethanol, and subjected to nested PCR
amplification. As in the construction of the starting pool of molecules, the first PCR
amplification utilized primers that flank the random region (primers 1 and 2) and the
second utilized a 5'-biotinylated primer (primer 3b) that has a 3'-terminal riboadenylate,
thereby reintroducing the target RNA phosphoester. The entire selective amplification
procedure requires 3-4 hours to perform.

The molecules are purified in three ways during each round of this procedure:
first, following PCR amplification, by extracting twice with phenol and once with
chloroform / isoamyl alcohol, then precipitating with ethanol; second, following
attachment of the DNA to streptavidin, by washing away all the nonbiotinylated
molecules under strongly denaturing conditions; and third, following elution with Pb²⁺,
by precipitating with ethanol. There is no gel electrophoresis purification step, and thus
no selection pressure constraining the molecules to a particular length.
C. Selection of Catalytic DNA

We carried out five successive rounds of in vitro selection, progressively
decreasing the reaction time following addition of Pb²⁺ in order to progressively increase
the stringency of selection. During rounds 1 through 3, the reaction time was 1 hour;
during round 4, the reaction time was 20 minutes; and during round 5, it was 1 minute.
The starting pool of single-stranded DNAs, together with the population of molecules
obtained after each round of selection, was assayed for self-cleavage activity under
conditions identical to those employed during in vitro selection (see Fig. 2).

For this assay, the molecules were prepared with a 5'-³²P rather than a 5'-biotin
moiety, allowing detection of both the starting material and the 5' cleavage product.
Following a 5-minute incubation, there was no detectable activity in the initial pool (G0)
or in the population obtained after the first and second rounds of selection. DNAs
obtained after the third round (G3) exhibited a modest level of activity; this activity
increased steadily, reaching approximately 50% self-cleavage for the DNAs obtained
after the fifth round of selection (G5). Cleavage was detected only at the target
phosphoester, even after long incubation times. This activity was lost if Pb²⁺ was
omitted from the reaction mixture.

Figure 2 illustrates the self-cleavage activity of the starting pool of DNA (G0)
and populations obtained after the first through fifth rounds of selection (G1 - G5).
Reaction mixtures contained 50 mM MgCl₂, 0.5 M NaCl, 0.5 M KCl, 50 mM HEPES (pH
7.0 at 23°C), and 3 nM [5'-³²P]-labeled DNA, incubated at 23°C for 5 min either in the
presence or in the absence of 1 mM PbOAc. The symbol Pre represents 108-nucleotide
precursor DNA (SEQ ID NO 4); Clv, 28-nucleotide 5'-cleavage product (SEQ ID NO 5);
and M, primer 3a (SEQ ID NO 6), corresponding in length to the 5'-cleavage product.

The 28-nucleotide 5' cleavage product (Clv) illustrated preferably has the
sequence 5'-GGGACGAATTCTAATACGACTCACTATN-3', wherein "N" represents
adenosine ribonucleotide with an additional 2',3'-cyclic phosphate on the 3' end (SEQ
ID NO 5). In alternative embodiments, "N" represents adenosine ribonucleotide with an
additional 2' or 3' phosphate on the 3' end of the molecule.

In Figure 2, the "G0" lane "Pre" band comprises a sampling of 108-nucleotide
precursor DNAs that each include 50 random nucleotides. Therefore, any given "Pre"
sampling will contain a wide variety of precursor DNAs, and each sampling will likely
differ from previous and subsequent samplings. The "G1" through "G5" lanes contain
"Pre" bands that are increasingly enriched for catalytic DNA molecules, but still contain
a large number of different DNA sequences (i.e., differing in the 50 nucleotide
randomized domain). A sample of these different sequences from "G5 Pre" DNA is
provided in Figure 3.

Shotgun cloning techniques were employed to isolate individuals from the G5
population; the complete nucleotide sequences of 20 of these subclones were then
determined (see Fig. 3). (Also see, e.g., Cadwell and Joyce, in PCR Methods and
Applications 2: 28-33 (1992); Cadwell and Joyce, PCR Methods and Applications 3
(Suppl.): S136-S140 (1994).) Of the 20 sequences, five were unique, two occurred
twice, one occurred three times, and one occurred eight times. All of the individual
variants share common sequence elements within the 50-nucleotide region that had
been randomized in the starting pool of DNA. They all contain two presumed template
regions, one with complementarity to a stretch of nucleotides that lies just upstream
from the cleavage site and the other with complementarity to nucleotides that lie at
least four nucleotides downstream. Between these two presumed template regions lies
a variable domain of 1-11 nucleotides, followed by the fixed sequence 5'-AGCG-3', then
a second variable domain of 3-8 nucleotides, and finally the fixed sequence 5'-CG-3' or
5'-CGA-3'. Nucleotides that lie outside of the two presumed template regions are highly
variable in both sequence and length. In all of the sequenced subclones, the region
corresponding to the 50 initially-randomized nucleotides remains a total of 50
nucleotides in length.
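
The subclone tally reported above can be checked with a few lines of arithmetic. The following is only an illustrative sketch; the variable names are ours and do not appear elsewhere in this disclosure.

```python
# Occurrence classes reported for the 20 sequenced G5 subclones:
# five sequences seen once, two seen twice, one seen three times,
# and one seen eight times.
occurrences = [1] * 5 + [2] * 2 + [3] + [8]

distinct_variants = len(occurrences)   # number of different sequences
total_subclones = sum(occurrences)     # number of subclones sequenced
dominant_fraction = max(occurrences) / total_subclones

print(distinct_variants, total_subclones, dominant_fraction)
```

The tally gives nine distinct variants among the 20 subclones, with the most frequent variant accounting for 40% of them.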

Figure 3 illustrates the sequence alignment of individual variants isolated from
the population after five rounds of selection. The fixed substrate domain (5'-
GGGACGAATTCTAATACGACTCACTATrAGGAAGAGATGGCGAC-3', or 5'-
GGGACGAATTCTAATACGACTCACTATNGGAAGAGATGGCGAC-3', where N represents
adenosine ribonucleotide) (SEQ ID NO 13) is shown at the top, with the target
riboadenylate identified with an inverted triangle. Substrate nucleotides that are
commonly involved in presumed base-pairing interactions are indicated by a vertical bar.
Sequences corresponding to the 50 initially-randomized nucleotides are aligned
antiparallel to the substrate domain. All of the variants are 3'-terminated by the fixed
sequence 5'-CGGTAAGCTTGGCAC-3' (SEQ ID NO 1) ("primer site"; not shown).
Nucleotides within the initially-randomized region that are presumed to form base pairs 
with the substrate domain are indicated on the right and left sides of the Figure; the 
putative base-pair-forming (or substrate binding) regions of the enzymatic DNA 
molecules are individually boxed in each sequence shown. The highly-conserved 
nucleotides within the putative catalytic domain are illustrated in the two boxed 
columns. 
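
The lengths quoted for the precursor (108 nucleotides) and the 5'-cleavage product (28 nucleotides) follow from the fixed sequences given above. The sketch below assembles them for illustration only. The lower-case "a" is our own placeholder for the single adenosine ribonucleotide (rA), and the variable names are hypothetical.

```python
# Fixed substrate domain (SEQ ID NO 13); lower-case 'a' stands in for rA.
SUBSTRATE_DOMAIN = "GGGACGAATTCTAATACGACTCACTAT" + "a" + "GGAAGAGATGGCGAC"
RANDOM_REGION_LENGTH = 50            # initially randomized positions
PRIMER_SITE = "CGGTAAGCTTGGCAC"      # 3'-terminal fixed sequence (SEQ ID NO 1)

# Full-length precursor: substrate domain + random region + primer site.
precursor_length = len(SUBSTRATE_DOMAIN) + RANDOM_REGION_LENGTH + len(PRIMER_SITE)

# Cleavage occurs at the phosphoester immediately 3' of the rA.
cleavage_index = SUBSTRATE_DOMAIN.index("a") + 1
five_prime_product = SUBSTRATE_DOMAIN[:cleavage_index]
three_prime_fragment_length = precursor_length - len(five_prime_product)

print(precursor_length, len(five_prime_product), three_prime_fragment_length)
```

Under these assumptions the precursor is 108 nucleotides long, the 5' product is 28 nucleotides long (the same length as primer 3), and the remaining 3' fragment is 80 nucleotides long.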

While it is anticipated that additional data will be helpful in constructing a 
meaningful secondary structural model of the catalytic domain, we note that, like the 
hammerhead and hairpin ribozymes, the catalytic domain of our enzymatic DNA
molecules appears to contain a conserved core flanked by two substrate binding regions
(or recognition domains) that interact with the substrate through base-pairing
interactions. Similar to the hammerhead and hairpin ribozymes, the catalytic DNAs also
appear to require a short stretch of unpaired substrate nucleotides -- in this case 
5'-GGA-3' -- between the two regions that are involved in base pairing. 

It was also interesting to note that each of the nine distinct variants exhibited a 
different pattern of presumed complementarity with the substrate domain. In some 
cases, base pairing was contiguous, while in others it was interrupted by one or more 
noncomplementary pairs. The general tendency seems to be to form tighter interactions
with the nucleotides that lie upstream from the cleavage site than with those that lie
downstream. Binding studies and site-directed mutagenesis analysis should enable us to 
gain further insights and to further substantiate this conjecture. 

In order to gain further insight into the sequence requirements for catalytic 
function, the self-cleavage activity of six of the nine variants was tested and evaluated 
under the within-described selection conditions (see Fig. 3). Not surprisingly, the 
sequence that occurred in eight of the 20 subclones proved to be the most reactive, 
with a first-order rate constant of 1.4 min⁻¹. All of the studied variants were active in
the self-cleavage assay and all gave rise to a single 5'-labeled product corresponding to 
cleavage at the target RNA phosphoester. 
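
To give a sense of what a first-order rate constant of 1.4 min⁻¹ implies, the following sketch computes the fraction cleaved over time. It assumes simple single-exponential kinetics with no unreactive fraction, which is an idealization and not a result reported here; the function name is ours.

```python
import math

def fraction_cleaved(k_obs_per_min, minutes):
    """Fraction of molecules cleaved after a given time,
    assuming ideal first-order kinetics: f = 1 - exp(-k * t)."""
    return 1.0 - math.exp(-k_obs_per_min * minutes)

K_OBS = 1.4                          # min^-1, most reactive variant
half_life = math.log(2) / K_OBS      # time for half of the molecules to cleave

for t in (0.25, 0.5, 1.0, 5.0):
    print(f"{t:5.2f} min: {fraction_cleaved(K_OBS, t):.3f}")
print(f"half-life: {half_life:.3f} min")
```

Under this idealization the half-life of the most reactive variant is roughly half a minute.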

The dominant subclone was further analyzed under a variety of reaction 
conditions. Its self-cleavage activity was dependent on Pb²⁺ but was unaffected if
Mg²⁺ was omitted from the reaction mixture. There was a requirement for a
monovalent cation as well, which can be met by either Na⁺ or K⁺. The reaction rate
increased linearly with increasing concentration of monovalent cation over the range of
0 - 1.0 M (r = 0.998). Other variables that may affect the reaction, such as pH,
temperature, and the presence of other divalent metals, are in the process of being
evaluated further.

Example 2



Materials and Methods 



A. Oligonucleotides and Oligonucleotide Analogs 

Synthetic DNAs and DNA analogs were purchased from Operon Technologies. 
The 19-nucleotide substrate, 5'-pTCACTATrAGGAAGAGATGG-3' (or 5'-
pTCACTATNGGAAGAGATGG-3', wherein "N" represents adenosine ribonucleotide)
(SEQ ID NO 7), was prepared by reverse-transcriptase catalyzed extension of
5'-pTCACTATrA-3' (or 5'-pTCACTATN-3', wherein "N" represents adenosine
ribonucleotide) (SEQ ID NO 8), as previously described (Breaker, Banerji, & Joyce,
Biochemistry 33: 11980-11986 (1994)), using the template
5'-CCATCTCTTCCTATAGTGAGTCCGGCTGCA-3' (SEQ ID NO 9). Primer 3, 5'-
GGGACGAATTCTAATACGACTCACTATrA-3' (or 5'-
GGGACGAATTCTAATACGACTCACTATN-3', wherein "N" represents adenosine
ribonucleotide) (SEQ ID NO 6), was either 5'-labeled with [γ-³²P]ATP and T4
polynucleotide kinase (primer 3a) or 5'-thiophosphorylated with [γ-S]ATP and T4
polynucleotide kinase and subsequently biotinylated with N-iodoacetyl-N'-
biotinylhexylenediamine (primer 3b).

B. DNA Pool Preparation

The starting pool of DNA was prepared by PCR using the synthetic oligomer 
5'-GTCCCAAGCTTACCG-N₅₀-GTCGCCATCTCTTCC-3' (SEQ ID NO 4), where N is an
equimolar mixture of G, A, T and C. A 2-ml PCR, containing 500 pmoles of the
randomized oligomer, 1,000 pmoles primer 1 (5'-GTGCCAAGCTTACCG-3', SEQ ID NO
10), 500 pmoles primer 2
(5'-CTGCAGAATTCTAATACGACTCACTATAGGAAGAGATGGCGAC-3', SEQ ID NO 11),
500 pmoles primer 3b, 10 µCi [α-³²P]dATP, and 0.2 U µl⁻¹ Taq DNA polymerase, was
incubated in the presence of 50 mM KCl, 1.5 mM MgCl₂, 10 mM Tris-HCl (pH 8.3 at
23°C), 0.01% gelatin, and 0.2 mM of each dNTP for 1 min at 92°C, 1 min at 50°C, and
2 min at 72°C, then 5 cycles of 1 min at 92°C, 1 min at 50°C, and 1 min at 72°C. The
resulting mixture was extracted twice with phenol and once with chloroform / isoamyl 
alcohol, and the DNA was isolated by precipitation with ethanol. 
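
As a rough check on pool size, the 500 pmoles of randomized oligomer in the 2-ml PCR can be converted to a number of molecules and compared with the number of possible 50-nucleotide sequences. The sketch below is illustrative only. It assumes that every template molecule is full-length and distinct, which gives an upper bound on pool diversity, and the variable names are ours.

```python
AVOGADRO = 6.02214076e23           # molecules per mole

template_pmol = 500                # randomized oligomer in the PCR
reaction_volume_ml = 2.0
random_positions = 50

# Upper bound on the number of template molecules in the reaction.
template_molecules = template_pmol * 1e-12 * AVOGADRO

# Number of possible sequences for a 50-nucleotide random region.
sequence_space = 4 ** random_positions

# Fraction of sequence space that could be sampled at most.
coverage = template_molecules / sequence_space

# Template concentration in micromolar (pmol per ml equals nM).
template_uM = template_pmol / reaction_volume_ml / 1000.0

print(f"{template_molecules:.2e} molecules, space {sequence_space:.2e}, "
      f"coverage {coverage:.1e}, template {template_uM} uM")
```

On these assumptions the reaction contains about 3 x 10^14 template molecules, so any pool of this kind samples only a minute fraction of the roughly 10^30 possible sequences.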

C. In Vitro Selection 

The starting pool of DNA was resuspended in 500 µl of buffer A (1 M NaCl and
50 mM HEPES (pH 7.0 at 23°C)) and was passed repeatedly over a streptavidin column
(AffiniTip Strep 20, Genosys, The Woodlands, TX). The column was washed with five
100-µl volumes of buffer A, followed by five 100-µl volumes of 0.2 N NaOH, then
equilibrated with five 100-µl volumes of buffer B (0.5 M NaCl, 0.5 M KCl, 50 mM
MgCl₂, and 50 mM HEPES (pH 7.0 at 23°C)). The immobilized single-stranded DNA was
eluted over the course of 1 hr with three 20-µl volumes of buffer B with added 1 mM
PbOAc. The entire immobilization and elution process was conducted at 23°C. The
eluant was collected in an equal volume of buffer C (50 mM HEPES (pH 7.0 at 23°C)
and 80 mM EDTA) and the DNA was precipitated with ethanol.

The resulting DNA was amplified in a 100-µl PCR containing 20 pmoles primer
1, 20 pmoles primer 2, 0.05 U µl⁻¹ Taq polymerase, 50 mM KCl, 1.5 mM MgCl₂, 10 mM
Tris-HCl (pH 8.3 at 23°C), 0.01% gelatin, and 0.2 mM of each dNTP for 30 cycles of
10 sec at 92°C, 30 sec at 50°C, and 30 sec at 72°C. The reaction products were
extracted twice with phenol and once with chloroform / isoamyl alcohol, and the DNA
was recovered by precipitation with ethanol. Approximately 4 pmoles of the amplified
DNA was added to a second, nested PCR containing 100 pmoles primer 1, 100 pmoles
primer 3b, 20 µCi [α-³²P]dATP, and 0.1 U µl⁻¹ Taq polymerase, in a total volume of 200
µl that was amplified for 10 cycles of 1 min at 92°C, 1 min at 50°C, and 1 min at
72°C. The PCR products were once more extracted and precipitated, and the resulting
DNA was resuspended in 50 µl buffer A, then used to begin the next round of
selection.

The second and third rounds were carried out as above, except that the nested
PCR at the end of the third round was performed in a 100-µl volume. During the fourth
round, the elution time following addition of Pb²⁺ was reduced to 20 min (two 20-µl
elution volumes) and only half of the recovered DNA was used in the first PCR, which
involved only 15 temperature cycles. During the fifth round, the elution time was
reduced to 1 min (two 20-µl elution volumes) and only one-fourth of the recovered DNA
was used in the first PCR, which involved 15 temperature cycles. DNA obtained after
the fifth round of selection was subcloned and sequenced, as described previously
(Tsang & Joyce, Biochemistry 33: 5966-5973 (1994)).

D. Kinetic Analysis of Catalytic DNAs

Populations of DNA and various subcloned individuals were prepared with a
5'-³²P label by asymmetric PCR in a 25-µl reaction mixture containing 10 pmoles primer
3a, 0.5 pmoles input DNA, and 0.1 U µl⁻¹ Taq polymerase, under conditions as described
above, for 10 cycles of 1 min at 92°C, 1 min at 50°C, and 1 min at 72°C. The
resulting [5'-³²P]-labeled amplification products were purified by electrophoresis in a
10% polyacrylamide / 8 M urea gel.

Self-cleavage assays were carried out following preincubation of the DNA in 
buffer B for 10 min. Reactions were initiated by addition of PbOAc to 1 mM final
concentration and were terminated by addition of an equal volume of buffer C. Reaction
products were separated by electrophoresis in a 10% polyacrylamide / 8 M urea gel. Kinetic
assays under multiple-turnover conditions were carried out in buffer B that included 50
µg ml⁻¹ BSA to prevent adherence of material to the vessel walls. Substrate and enzyme
molecules were preincubated separately for 5 min in reaction buffer that lacked Pb²⁺,
then combined, and the reaction was initiated by addition of PbOAc to a final 
concentration of 1 mM. 

Example 3 

Evolution of Deoxyribozymes
That Cleave Intermolecularly

A. Conversion to an Intermolecular Format 

Based on the variable pattern of presumed base-pairing interactions between the 
catalytic and substrate domains of the studied variants, it was felt that it would be 
reasonably straightforward to convert the DNA-catalyzed reaction to an intermolecular 
format. In doing so, we wished to simplify the two substrate-binding regions of the 
catalyst so that each would form an uninterrupted stretch of 7-8 base pairs with the 
substrate. In addition, we wished to provide a minimal substrate, limited to the two 
base-pairing regions and the intervening sequence 5'-GGA-3' (Fig. 4A).

Figures 4A and 4B illustrate DNA-catalyzed cleavage of an RNA phosphoester in 
an intermolecular reaction that proceeds with catalytic turnover. Figure 4A is a 
diagrammatic representation of the complex formed between the 19mer substrate and
38mer DNA enzyme. The substrate contains a single adenosine ribonucleotide ("rA" or 
"N", adjacent to the arrow), flanked by deoxyribonucleotides. The synthetic DNA 
enzyme is a 38-nucleotide portion of the most frequently occurring variant shown in Fig. 
3. Highly-conserved nucleotides located within the putative catalytic domain are 
"boxed". As illustrated, one conserved sequence is "AGCG", while another is "CG" 
(reading in the 5'-3' direction). 

Figure 4B shows an Eadie-Hofstee plot used to determine Km (negative slope)
and Vmax (y-intercept) for DNA-catalyzed cleavage of [5'-³²P]-labeled substrate under
conditions identical to those employed during in vitro selection. Initial rates of cleavage
were determined for reactions involving 5 nM DNA enzyme and either 0.125, 0.5, 1, 2,
or 4 µM substrate.
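
An Eadie-Hofstee analysis plots the initial rate v against v/[S]; because v = Vmax - Km (v/[S]), the slope is -Km and the y-intercept is Vmax. The sketch below illustrates the fit. The rates are synthetic values generated from the Michaelis-Menten equation using kcat = 1 min⁻¹ and Km = 2 µM, and are not the measured data of Figure 4B; the substrate concentrations are taken to be micromolar, and the function names are ours.

```python
def michaelis_menten_rate(substrate_uM, vmax, km_uM):
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * substrate_uM / (km_uM + substrate_uM)

def eadie_hofstee_fit(substrate_uM, rates):
    """Least-squares line through the points (v/[S], v).
    Returns (Km, Vmax): Km is the negative slope, Vmax the y-intercept."""
    x = [v / s for v, s in zip(rates, substrate_uM)]
    y = list(rates)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, intercept

ENZYME_UM = 0.005                       # 5 nM DNA enzyme
K_CAT = 1.0                             # min^-1 (value stated in the text)
KM_UM = 2.0                             # uM (value stated in the text)
substrate = [0.125, 0.5, 1.0, 2.0, 4.0]  # uM (assumed unit)

# Synthetic, noise-free rates in uM per minute (illustrative only).
rates = [michaelis_menten_rate(s, K_CAT * ENZYME_UM, KM_UM) for s in substrate]

km_fit, vmax_fit = eadie_hofstee_fit(substrate, rates)
print(km_fit, vmax_fit, vmax_fit / ENZYME_UM)
```

With noise-free input the fit returns the Km and Vmax used to generate the rates, and dividing Vmax by the enzyme concentration recovers kcat. With real data the same fit would be applied to the measured initial rates.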

In designing the catalytic domain, we relied heavily on the composition of the 
most reactive variant, truncating by two nucleotides at the 5' end and 11 nucleotides at
the 3' end. The 15 nucleotides that lay between the two template regions were left
unchanged and a single nucleotide was inserted into the 3' template region to form a
continuous stretch of nucleotides capable of forming base pairs with the substrate. The
substrate was simplified to the sequence 5'-TCACTATrA • GGAAGAGATGG-3' (or
5'-TCACTATN • GGAAGAGATGG-3', wherein "N" represents adenosine ribonucleotide)
(SEQ ID NO 12), where the underlined nucleotides correspond to the two regions
involved in base pairing with the catalytic DNA molecule.

The simplified reaction system, employing a 38mer catalytic DNA molecule 
(catalyst) comprised entirely of deoxyribonucleotides and a 19mer substrate containing a 
single ribonucleotide embedded within an otherwise all-DNA sequence, allows efficient 
DNA-catalyzed phosphoester cleavage with rapid turnover. Over a 90-minute incubation
in the presence of 0.01 µM catalyst and 1 µM substrate, 46% of the substrate is
cleaved, corresponding to 46 turnovers of the catalyst. A preliminary kinetic analysis of
this reaction was carried out under multiple-turnover conditions. The DNA
catalyst exhibits Michaelis-Menten kinetics, with values for kcat and Km of 1 min⁻¹ and 2
µM, respectively (see Fig. 4B). The value for Km is considerably greater than the
expected dissociation constant between catalyst and substrate based on Watson-Crick
interactions. The substrate was incubated under identical reaction conditions (but in the
absence of the catalyst); a value for kuncat of 4 × 10⁻⁶ min⁻¹ was obtained. This is
consistent with the reported value of 5 × 10⁻³ min⁻¹ for hydrolysis of the more labile
1-nitrophenyl-1,2-propanediol in the presence of 0.5 mM Pb²⁺ at pH 7.0 and 37°C
(Breslow & Huang, PNAS USA 88: 4080-4083 (1991)).
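
The turnover number and the rate enhancement implied by these values can be worked out directly. The sketch below is illustrative only. It takes the catalyst and substrate concentrations as 0.01 and 1 (in the same unit, so that only their ratio matters), kcat as 1 min⁻¹, and kuncat as 4 × 10⁻⁶ min⁻¹; the variable names are ours.

```python
catalyst_conc = 0.01          # catalyst concentration
substrate_conc = 1.0          # substrate concentration (same unit)
fraction_cleaved = 0.46       # after a 90-minute incubation

# Each catalyst molecule must act repeatedly to cleave this much substrate.
turnovers = fraction_cleaved * substrate_conc / catalyst_conc

k_cat = 1.0                   # min^-1, catalyzed reaction
k_uncat = 4e-6                # min^-1, uncatalyzed reaction (assumed exponent -6)

# Ratio of catalyzed to uncatalyzed first-order rate constants.
rate_enhancement = k_cat / k_uncat

print(f"turnovers: {turnovers:.0f}, rate enhancement: {rate_enhancement:.1e}")
```

On these figures each catalyst molecule cleaves 46 substrate molecules, and the catalyzed rate exceeds the uncatalyzed rate by a factor on the order of 10^5.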

It is now presumed that the phosphoester cleavage reaction proceeds via a 
hydrolytic mechanism involving attack by the ribonucleoside 2'-hydroxyl on the vicinal
phosphate, generating a 5' product with a terminal 2'(3')-cyclic phosphate and a 3'
product with a terminal 5'-hydroxyl. In support of this mechanism, the 3'-cleavage
product is efficiently phosphorylated with T4 polynucleotide kinase and [γ-³²P]ATP,
consistent with the availability of a free 5'-hydroxyl (data not shown).

B. Discussion

After five rounds of in vitro selection, a population of single-stranded DNA 
molecules that catalyze efficient Pb²⁺-dependent cleavage of a target RNA phosphoester
was obtained. Based on the common features of representative individuals isolated
from this population, a simplified version of both the catalytic and substrate domains
was constructed, leading to a demonstration of rapid catalytic turnover in an
intermolecular context. Thus the 38mer catalytic domain provides an example of a DNA
enzyme, or what might be termed a "deoxyribozyme".

Referring to this molecule as an enzyme, based on the fact that it is an 
informational macromolecule capable of accelerating a chemical transformation in a 
reaction that proceeds with rapid turnover and obeys Michaelis-Menten kinetics, may 
not satisfy everyone's notion of what constitutes an enzyme. Some might insist that an 
enzyme, by definition, must be a polypeptide. If, however, one accepts the notion of an
RNA enzyme, then it seems reasonable to adopt a similar view concerning DNA
enzymes. Considering how quickly we were able to generate this molecule from a pool
of random-sequence DNAs, we expect that many other examples of synthetic DNA
enzymes will appear in the near future. 

The Pb²⁺-dependent cleavage of an RNA phosphoester was chosen as an initial
target for DNA catalysis because it is a straightforward reaction that simply requires the
proper positioning of a coordinated Pb²⁺-hydroxyl to facilitate deprotonation of the
2'-hydroxyl that lies adjacent to the cleavage site. (See, e.g., Pan, et al., in The RNA
World, Gesteland & Atkins (eds.), pp. 271-302, Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, NY (1993).) Pb²⁺ is known to coordinate to the N7 position of
purines, the O6 position of guanine, the O4 position of uracil, and the N3 position of
cytosine (Brown, et al., Nature 303: 543-546 (1983)). Thus, the differences in sugar
composition and conformation of DNA compared to RNA seemed unlikely to prevent
DNA from forming a well-defined Pb²⁺-binding pocket.

A substrate that contains a single ribonucleotide within an otherwise all-DNA 
sequence was chosen because it provided a uniquely favored site for cleavage and 
ensured that any resulting catalytic activity would be attributable solely to DNA.
Substrate recognition appears to depend on two regions of base-pairing interactions
between the catalyst and substrate. However, the unpaired substrate nucleotides,
5'-GGA-3', that lie between these two regions may play an important role in substrate
recognition, metal coordination, or other aspects of catalytic function. 

It is further anticipated that an all-RNA molecule, other RNA-DNA composites, 
and molecules containing one or more nucleotide analogs may be acceptable substrates. 
As disclosed herein, the within-described in vitro evolution procedures may successfully 
be used to generate enzymatic DNA molecules having the desired specificities; further 
analyses along these lines are presently underway. 

In addition, studies to determine whether the presumed base-pairing interactions 
between enzyme and substrate are generalizable with respect to sequence are in 
progress, using the presently-described methods. The within-disclosed Pb²⁺-dependent
deoxyribozymes may also be considered model compounds for exploring the structural 
and enzymatic properties of DNA. 

The methods employed in the present disclosure for the rapid development of 
DNA catalysts will have considerable generality, allowing us to utilize other cofactors to 
trigger the cleavage of a target linkage attached to a potential catalytic domain. In this 
regard, the development of Mg²⁺-dependent DNA enzymes that specifically cleave
target RNAs under physiological conditions is of interest, as is the development of DNA
enzymes that function in the presence of other cations (see Example 4). Such
molecules will provide an alternative to traditional antisense and ribozyme approaches
for the specific inactivation of target mRNAs.

DNA thus joins RNA and protein on the list of biological macromolecules that are
capable of exhibiting enzymatic activity. The full extent of DNA's catalytic abilities 
remains to be explored, but these explorations should proceed rapidly based on in vitro 
selection methods such as those employed in this study. 

DNA enzymes offer several important advantages compared to other 
macromolecular catalysts. First, they are easy to prepare, in an era when most 
laboratories have access to an automated DNA synthesizer and the cost of DNA 
phosphoramidites has become quite modest. Second, they are very stable compounds, 
especially compared to RNA, thus facilitating their use in biophysical studies. Third, we 
expect that they can be adapted to therapeutic applications that at present make use of 
antisense DNAs that lack RNA-cleavage activity. In vitro selection could be carried out 
with DNA analogs, including compounds that are nuclease resistant such as 
phosphorothioate-containing DNA, so long as these analogs can be prepared in the form 
of a deoxynucleoside 5'-triphosphate and are accepted as a substrate by a 
DNA-dependent DNA polymerase. Finally, DNA enzymes offer a new window on our
understanding of the macromolecular basis of catalytic function. It will be interesting, 
for example, to carry out comparative analyses of protein-, RNA-, and DNA-based 
enzymes that catalyze the same chemical transformation. 

Example 4

Other Families of Catalytic DNAs

A starting pool of DNA was prepared by PCR essentially as described in Example
2.B. above, except that the starting pool of DNA comprised molecules containing 40
random nucleotides. Thus, the starting pool of DNA described herein was prepared by
PCR using the synthetic oligomer 5' GGG ACG AAT TCT AAT ACG ACT CAC TAT rA
GG AAG AGA TGG CGA CAT CTC N₄₀ GT GAC GGT AAG CTT GGC AC 3' (SEQ ID NO
23), where N is an equimolar mixture of G, A, T and C, and where the DNA molecules 
were selected for the ability to cleave the phosphoester following the target rA. (See 
Figure 6A, also.) 

Selective amplification was carried out in the presence of either Pb²⁺, Zn²⁺, Mn²⁺,
or Mg²⁺, thereby generating at least four "families" of catalytic DNA molecules. As
illustrated in Figure 5, catalytic DNA molecules demonstrating specific activity were 
generated in the presence of a variety of cations. 

Figure 5 is a photographic representation showing a polyacrylamide gel 
demonstrating specific endoribonuclease activity of four families of selected catalytic 
DNAs. Selection of a Pb²⁺-dependent family of molecules was repeated in a side-by-side
fashion as a control. In each group of three lanes, the first lane shows the lack of
activity of the selected population in the absence of the metal cation, the second lane
shows the observed activity in the presence of the metal cation, and the third lane
shows the lack of activity of the starting pool (G0). At present, the order of reactivity is
observed to be Pb²⁺ > Zn²⁺ > Mn²⁺ > Mg²⁺, mirroring the pKa of the corresponding
metal-hydroxide.

After either five (G5) or six (G6) rounds of selective amplification in the presence
of the preselected divalent cation, the desired endonuclease activity was obtained. The
following description of selective amplification in the presence of Mg²⁺ is intended to be
exemplary.

Six rounds of in vitro selective amplification were carried out, following the
method described in Example 2 hereinabove, except that the divalent metal used was 1
mM Mg²⁺ rather than 1 mM Pb²⁺. (See also Breaker and Joyce, Chem. & Biol. 1:
223-229 (1994), incorporated by reference herein, which describes essentially the same 
procedure.) 

Individual clones were isolated following the sixth round, and the nucleotide 
sequence of 24 of these clones was determined. All of the sequences began with: 5'
GGG ACG AAT TCT AAT ACG ACT CAC TAT rA GG AAG AGA TGG CGA CA (SEQ ID
NO 23 from position 1 to 44) and ended with: CGG TAA GCT TGG CAC 3' (SEQ ID
NO 23 from position 93 to 107).

The segment in the middle, corresponding to TCTC N₄₀ GTGA (SEQ ID NO 23
from position 45 to 92) in the starting pool, varied as follows:

(13) CCG CCC ACC TCT TTT ACG AGC CTG TAC GAA ATA GTG CTC TTG
     TTA GTA T (SEQ ID NO 24)
(5)  TCT CTT CAG CGA TGC ACG CTT GTT TTA ATG TTG CAC CCA TGT
     TAG TGA (SEQ ID NO 25)
(2)  TCT CAT CAG CGA TTG AAC CAC TTG GTG GAC AGA CCC ATG TTA
     GTG A (SEQ ID NO 26)
(1)  CCG CCC ACC TCT TTT ACG AGC CTG TAC GAA ATA GTG TTC TTG
     TTA GTA T (SEQ ID NO 27)
(1)  CCG CCC ACC TCT TTT ACG AGC CTG TAC GAA ATA GTG CTC TCG
     TTA GTA T (SEQ ID NO 28)
(1)  TCT CAG ACT TAG TCC ATC ACA CTC TGT GCA TAT GCC TGC TTG
     ATG TGA (SEQ ID NO 29)
(1)  -CT CTC ATC TGC TAG CAC GCT CGA ATA GTG TCA GTC GAT GTG A
     (SEQ ID NO 30).




The initial number in parentheses indicates the number of clones having that 
particular sequence. Note that some mutations (highlighted in bold type) occurred at
nucleotide positions other than those that were randomized initially. 
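
The position ranges quoted for SEQ ID NO 23 can be checked by assembling the template and slicing it. The sketch below is illustrative only: the lower-case "a" is our placeholder for the adenosine ribonucleotide (rA), "N" stands for each randomized position, and the helper name is ours.

```python
# SEQ ID NO 23 with 'a' standing in for rA and 'N' for randomized positions.
FIVE_PRIME_FIXED = "GGGACGAATTCTAATACGACTCACTAT" + "a" + "GGAAGAGATGGCGACATCTC"
RANDOM_REGION = "N" * 40
THREE_PRIME_FIXED = "GTGACGGTAAGCTTGGCAC"
TEMPLATE = FIVE_PRIME_FIXED + RANDOM_REGION + THREE_PRIME_FIXED

def segment(sequence, first, last):
    """Return positions first..last of a sequence (1-based, inclusive)."""
    return sequence[first - 1:last]

constant_start = segment(TEMPLATE, 1, 44)    # shared 5' sequence of all clones
middle = segment(TEMPLATE, 45, 92)           # TCTC N40 GTGA in the starting pool
constant_end = segment(TEMPLATE, 93, 107)    # shared 3' sequence of all clones

print(len(TEMPLATE), constant_start[-2:], middle[:4], middle[-4:], constant_end)
```

The template is 107 positions long; positions 45 to 92 cover the 40 randomized positions together with the four fixed nucleotides on either side, which is why mutations outside the randomized region can appear in the listed segments.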

The second sequence listed above (i.e., SEQ ID NO 25), which occurred in 5 of
24 clones, was chosen as a lead (i.e., principal) compound for further study. Its
cleavage activity was measured in the presence of a 1 mM concentration of various
divalent metals and 1 M NaCl at pH 7.0 and 23°C:

metal     kobs (min⁻¹)

none      n.d.
Mg²⁺      2.3 × 10⁻³
Mn²⁺      6.8 × 10⁻³
Zn²⁺      4.2 × 10⁻²
Pb²⁺      1.1 × 10⁻²

Thus, the lead compound is active in the presence of all four divalent metals, 
even though it was selected for activity in the presence of Mg²⁺. Conversely, DNA
molecules that were selected for activity in the presence of Mn²⁺, Zn²⁺, or Pb²⁺ did not
show any activity in the presence of Mg²⁺.

In addition, the population of DNAs obtained after six rounds of in vitro selection
in the presence of Mg²⁺, when prepared as all-phosphorothioate-containing DNA
analogs, showed Mg²⁺-dependent cleavage activity at an observed rate of ~10⁻³ min⁻¹.
The phosphorothioate-containing analogs were prepared enzymatically so as to have an
Rp configuration at each stereocenter. Such compounds are relatively resistant to
degradation by cellular nucleases compared to unmodified DNA.

The lead compound was re-randomized at 40 nucleotide positions (underlined),
introducing mutations at a frequency of 15% (5% probability of each of the three
possible base substitutions). The re-randomized population was subjected to seven
additional rounds of in vitro selection. During the last four rounds, molecules that were
reactive in the presence of 1 mM Pb2+ were removed from the population before the
remainder were challenged to react in the presence of 1 mM Mg2+. Individual clones
were isolated following the seventh round and the nucleotide sequence of 14 of these
clones was determined. All of the sequences began with: 5' GGG ACG AAT TCT AAT
ACG ACT CAC TAT rA GG AAG AGA TGG CGA CAT CTC (SEQ ID NO 23, from position
1 to 48), and ended with: GTG ACG GTA AGC TTG GCA C 3' (SEQ ID NO 23, from
position 89 to 107).
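
The re-randomization statistics stated above can be checked with a short calculation. The following is a minimal sketch, assuming that each of the 40 positions mutates independently at the stated 15% frequency (a binomial model, which the text does not state explicitly); the function and variable names are illustrative only.

```python
from math import comb

POSITIONS = 40        # partially randomized positions (from the text)
MUTATION_RATE = 0.15  # per-position mutation frequency (from the text)


def prob_k_mutations(k, n=POSITIONS, p=MUTATION_RATE):
    """Binomial probability that exactly k of n positions carry a mutation.

    Assumes positions mutate independently (an assumption of this sketch).
    """
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Mean number of mutations per molecule relative to the lead compound.
expected_mutations = POSITIONS * MUTATION_RATE

# Fraction of the re-randomized pool that is still identical to the lead.
p_unmutated = prob_k_mutations(0)

if __name__ == "__main__":
    print(f"expected mutations per molecule: {expected_mutations:.1f}")
    print(f"fraction of unmutated molecules: {p_unmutated:.4f}")
    for k in range(0, 13, 2):
        print(f"P({k:2d} mutations) = {prob_k_mutations(k):.4f}")
```

Under this model, a typical molecule in the re-randomized pool differs from the lead compound at about six positions, and only a small fraction of the pool remains unmutated.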

The segment in the middle, corresponding to the 40 partially-randomized
positions (N40, SEQ ID NO 23, from position 49 to 88), varied as follows:

(4) TAC AGC GAT TCA CCC TTG TTT AAG GGT TAC ACC CAT GTT A
(SEQ ID NO 31)

(2) ATC AGC GAT TAA CGC TTG TTT CAA TGT TAC ACC CAT GTT A
(SEQ ID NO 32)

(2) TTC AGC GAT TAA CGC TTA TTT TAG CGT TAC ACC CAT GTT A
(SEQ ID NO 33)

(1) ATC AGC GAT TCA CCC TTG TTT TAA GGT TGC ACC CAT GTT A
(SEQ ID NO 34)

(1) ATC AGC GAT TCA CCC TTG TTT AAG CGT TAC ACC CAT GTT G
(SEQ ID NO 35)

(1) ATC AGC GAT TCA CCC TTG TTT TAA GGT TAC ACC CAT GTT A
(SEQ ID NO 36)

(1) ATC AGC GAT TAA CGC TTA TTT TAG CGT TAC ACC CAT GTT A
(SEQ ID NO 37)

(1) ATC AGC GAT TAA CGC TTG TTT TAG TGT TGC ACC CAT GTT A
(SEQ ID NO 38)

(1) ATC AGC GAT TAA CGC TTA TTT TAG CAT TAC ACC CAT GTT A
(SEQ ID NO 39).

The number in parentheses indicates the number of clones having that particular 
sequence. Nucleotides shown in bold are those that differ compared to the lead 
compound. 
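
Differences between two cloned variants of the same length can be counted position by position. The sketch below is an illustration only, not part of the disclosed method: it compares three of the 40-nucleotide segments listed above (SEQ ID NOS 34, 36 and 37) using a simple Hamming distance. The helper name `hamming` is hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ.

    Spaces (used above to group nucleotides in threes) are ignored.
    """
    a = "".join(a.split())
    b = "".join(b.split())
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Segments copied from the list above.
SEQ_34 = "ATC AGC GAT TCA CCC TTG TTT TAA GGT TGC ACC CAT GTT A"
SEQ_36 = "ATC AGC GAT TCA CCC TTG TTT TAA GGT TAC ACC CAT GTT A"
SEQ_37 = "ATC AGC GAT TAA CGC TTA TTT TAG CGT TAC ACC CAT GTT A"

if __name__ == "__main__":
    print("SEQ 34 vs SEQ 36:", hamming(SEQ_34, SEQ_36))
    print("SEQ 36 vs SEQ 37:", hamming(SEQ_36, SEQ_37))
```

A pairwise comparison of this kind makes it easy to see which clones are near neighbors and which positions vary among the selected sequences.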

Formal analysis of the cleavage activity of these clones is ongoing. The
population as a whole exhibits Mg2+-dependent cleavage activity at an observed rate of
~10^-2 min^-1, with a comparable level of activity in the presence of Pb2+.

Figures 6A and 6B provide two-dimensional illustrations of a "progenitor" 
catalytic DNA molecule and one of several catalytic DNA molecules obtained via the 
selective amplification methods disclosed herein, respectively. Figure 6A illustrates an 
exemplary molecule from the starting pool, showing the overall configuration of the 
molecules represented by SEQ ID NO 23. As illustrated, various complementary 
nucleotides flank the random (N40) region.

Figure 6B is a diagrammatic representation of one of the Mg2+-dependent
catalytic DNA molecules (or "DNAzymes") generated via the within-described
procedures. The location of the ribonucleotide in the substrate nucleic acid is indicated
via the arrow. (The illustrated molecule includes the sequence identified herein as SEQ
ID NO 25, as well as "beginning" and "ending" sequences of SEQ ID NO 23.)

Endonuclease activity is continuing to be enhanced in each of the
aforementioned "families" via in vitro evolution, as disclosed herein, so it is anticipated
that enzymatic DNA molecules of increasingly desirable specificities may be generated
successfully using the within-disclosed guidelines.

Example 5
Cleavage of Larger RNA Sequences
As an extension of the foregoing, we have developed DNA enzymes that cleave
an all-RNA substrate, rather than a single ribonucleotide embedded within an otherwise
all-DNA substrate as demonstrated above. (Also see R.R. Breaker & G.F. Joyce, Chem.
& Biol. 1: 223-229 (1994); R.R. Breaker & G.F. Joyce, Chem. & Biol. 2: 655-660
(1995)). As a target sequence, we chose a stretch of 12 highly-conserved nucleotides
within the U5 LTR region of HIV-1 RNA, having the sequence
5' GUAACUAGAGAU 3' (SEQ ID NO 49).
Following the methods described in the previous examples, we generated a pool
of 10^14 DNA molecules that have the following composition:

5'-GGAAAA r(GUAACUAGAGAU) GGAAGAGATGGCGAC N50
CGGTAAGCTTGGCAC-3' (SEQ ID NO 50),
where N is an equimolar mixture of the deoxyribonucleotides G, A, T, and C, and where
the sequence identified as "r(GUAACUAGAGAU)" is comprised of ribonucleotides.

(Optionally, one may alter the initial 5' nucleotide sequence, e.g., by adding an
additional dA residue to the sequence preceding the ribonucleotide portion at the 5' end,
thus causing the initial sequence to read "GGAAAAA" and causing SEQ ID NO 50 to be
99 residues in length. Clearly, this is but one example of the modifications that may be
made in order to engineer specific enzymatic DNA molecules, as disclosed in detail
herein.)
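
The composition of the starting pool can be tallied directly. The sketch below is illustrative only; it reads the pool size as 10^14 molecules (an assumption about the intended figure) and treats the 50 random positions as independent draws from four nucleotides. It confirms the 98-residue length of SEQ ID NO 50 (99 with the optional extra dA) and shows how sparsely such a pool samples the space of possible N50 sequences.

```python
# Fixed segments of the pool template, in 5' to 3' order (from the text).
FIVE_PRIME_DNA = "GGAAAA"
RNA_TARGET = "GUAACUAGAGAU"       # ribonucleotides, HIV-1 U5 LTR target
FIXED_MIDDLE = "GGAAGAGATGGCGAC"
N_RANDOM = 50                     # randomized positions (N50)
THREE_PRIME_DNA = "CGGTAAGCTTGGCAC"

POOL_SIZE = 10**14                # assumed reading of the stated pool size


def template_length(extra_five_prime: str = "") -> int:
    """Total residues in the template, optionally with extra 5' residues."""
    return (
        len(extra_five_prime)
        + len(FIVE_PRIME_DNA)
        + len(RNA_TARGET)
        + len(FIXED_MIDDLE)
        + N_RANDOM
        + len(THREE_PRIME_DNA)
    )


# Number of distinct N50 sequences and the fraction a pool could sample
# even if every molecule in it were unique.
sequence_space = 4**N_RANDOM
max_coverage = POOL_SIZE / sequence_space

if __name__ == "__main__":
    print("length of SEQ ID NO 50:", template_length())
    print("length with one extra dA:", template_length("A"))
    print(f"possible N50 sequences: {sequence_space:.3e}")
    print(f"maximum fraction sampled: {max_coverage:.1e}")
```

Because the pool covers only a minute fraction of the possible sequences, the selected catalysts represent solutions found by sparse sampling followed by selective amplification rather than by exhaustive search.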

The enzymatic DNA molecules thus produced were selected for their ability to
cleave a phosphoester that lies within the embedded RNA target sequence. Ten rounds
of in vitro selective amplification were carried out, based on the enzymatic DNA
molecules' activity in the presence of 10 mM Mg2+ at pH 7.5 and 37°C. During the
selection process, there was competition for "preferred" cleavage sites as well as for the
"best" catalyst that cleaves at each such preferred site. Two sites and two families of
catalysts emerged as possessing the most efficient cleavage capabilities (see Fig. 7).
Figure 7 illustrates some of the results of ten rounds of in vitro selective
amplification carried out essentially as described herein. As shown, two sites and two
families of catalysts emerged as displaying the most efficient cleavage of the target
sequence. Cleavage conditions were essentially as indicated in Fig. 7, namely, 10 mM
Mg2+, pH 7.5, and 37°C; data collected after the reaction ran for 2 hours is shown.
Cleavage (%) is shown plotted against the number of generations (here, 0 through 10).
The number/prevalence of catalytic DNA molecules capable of cleaving the target
sequence at the indicated sites in the substrate is illustrated via the vertical bars, with
cleavage at G↓UAACUAGAGAU shown by the striped bars, and with cleavage at
GUAACUA↓GAGAU illustrated via the open (lightly-shaded) bars. In Figure 7, as herein,
the arrow (↓) indicates the site between two neighboring nucleotides at which cleavage
occurs.

Various individuals from the population obtained after the 8th and 10th rounds
of selective amplification were cloned. The nucleotide sequences of 29 individuals from
the 8th round and 32 individuals from the 10th round were then determined (see Tables
2 and 3, respectively).

Under the heading "Nucleotide Sequence" in each of Tables 2 and 3 is shown
the portion of each identified clone that corresponds to the 50 nucleotides that were
randomized in the starting pool (i.e., N50); thus, the entire nucleotide sequence of a
given clone generally includes the nucleotide sequences preceding, following, and
including the "N50" segment, presuming the substrate sequence is attached and that
self-cleavage has not occurred. For example, the entire sequence of a (non-self-cleaved)
clone may generally comprise residue nos. 1-33 of SEQ ID NO 50, followed by the
residues representing the randomized N50 region, followed by residue nos. 84-98 of SEQ
ID NO 50, or by residue nos. 1-34 of SEQ ID NO 51, followed by the residues
representing the randomized N50 region, followed by residue nos. 85-99 of SEQ ID NO
51. It is believed, however, that the N50 (or N40) region -- or a portion thereof -- of each
clone is particularly important in determining the specificity and/or activity of a particular
enzymatic DNA molecule. This is particularly evident in reactions in which the substrate
and the DNAzyme are separate molecules (see, e.g., Figs. 8 and 9).
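
The assembly of a full-length (non-self-cleaved) clone from its "N50" segment, as described above, can be sketched as follows. This is an illustration under stated assumptions: the flanking segments are taken from the composition given for SEQ ID NO 50, ribonucleotides are simply written with U, and the function name `full_clone` is hypothetical. The example N50 segment is that of clone 8-17 (SEQ ID NO 56) from Table 2.

```python
# Residue nos. 1-33 of SEQ ID NO 50: 5' DNA, RNA target, fixed DNA segment.
LEFT_FLANK = "GGAAAA" + "GUAACUAGAGAU" + "GGAAGAGATGGCGAC"
# Residue nos. 84-98 of SEQ ID NO 50.
RIGHT_FLANK = "CGGTAAGCTTGGCAC"


def full_clone(n_region: str) -> str:
    """Return left flank + N region + right flank, ignoring spaces."""
    n_region = "".join(n_region.split())
    return LEFT_FLANK + n_region + RIGHT_FLANK


# "N50" segment of clone 8-17 as listed in Table 2 (SEQ ID NO 56).
CLONE_8_17_N50 = (
    "CTT CCA CCT TCC GAG CCG GAC GAA GTT ACT TTT TAT CAC ACT ACG TAT TG"
)

if __name__ == "__main__":
    seq = full_clone(CLONE_8_17_N50)
    print(len(seq))
    print(seq)
```

The same helper applies to any clone in Tables 2 and 3; clones whose listed segment is shorter or longer than 50 nucleotides simply give a correspondingly shorter or longer full sequence.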

Clone numbers are designated as 8-x or 10-x for individuals obtained after the
8th or 10th rounds, respectively. SEQ ID NOS are also listed and correspond to the
"N50" region of each clone.

Table 2

Cloned Individuals from 8th Round of Amplification

Clone No.  SEQ ID NO  "N50" Nucleotide Sequence (5'-3')

8-2        52         CCA ATA GTG CTA CTG TGT ATC TCA ATG CTG GAA ACA CGG GTT
                      ATC TCC CG

8-4        53         CCA AAA CAG TGG AGC ATT ATA TCT ACT CCA CAA AGA CCA CTT
                      TTC TCC CG

8-5^1      54         ATC CGT ACT AGC ATG CAG ACA GTC TGT CTG CTT TTT CAT TAC
                      TCA CTC CC

8-14       55         CAA TTC ATG ATG ACC AAC TCT GTC AAC ACG CGA ACT TTT AAC
                      ACT GGC A

8-17^2     56         CTT CCA CCT TCC GAG CCG GAC GAA GTT ACT TTT TAT CAC ACT
                      ACG TAT TG

8-3        57         GGC AAG AGA TGG CAT ATA TTC AGG TAA CTG TGG AGA TAC CCT
                      GTC TGC CA

8-6        58         CTA GAC CAT TCA CGT TTA CCA AGC TAT GGT AAG AAC TAG AAT
                      CAC GCG TA

8-8        59         CGT ACA CGT GGA AAA GCT ATA AGT CAA GTT CTC ATC ATG TAC
                      CTG ACC GC

8-10       60         CAG TGA TAC ATG AGT GCA CCG CTA CGA CTA AGT CTG TAA CTT
                      ATT CTA CC

8-22       61         ACC GAA TTA AAC TAC CGA ATA GTG TGG TTT CTA TGC TTC TTC
                      TTC CCT GA

8-11       62         CAG GTA GAT ATA ATG CGT CAC CGT GCT TAC ACT CGT TTT ATT
                      AGT ATG TC

8-21       63         CCC TAC AAC ACC ACT GGG CCC AAT TAG ATT AAC GCT ATT TTA
                      TAA CTC G

8-12       64         CCA AAC GGT TAT AAG ACT GAA AAC TCA ATC AAT AGC CCA ATC
                      CTC GCC C

8-13       65         CAC ATG TAT ACC TAA GAA ATT GGT CCC GTA GAC GTC ACA GAC
                      TTA CGC CA

8-23       66         CAC AAC GAA AAC AAT CTT CCT TGG CAT ACT GGG GAG AAA GTC
                      TGT TGT CC

8-40       67         CAC ACG AAC ATG TCC ATT AAA TGG CAT TCC GTT TTT CGT TCT
                      ACA TAT GC

8-24       68         CAG AAC GAG GGT CTT GTA AGA CTA CAC CTC CTC AGT GAC AAT
                      AAT CCT G

8-26       69         CAC TAC AGC CTG ATA TAT ATG AAG AAC AGG CAA CAA GCT TAT
                      GCA CTG G

8-27       70         GGG TAC ATT TAT GAT TCT CTT ATA AAG AGA ATA TCG TAC TCT
                      TTT CCC CA

8-28       71         CCA AAG TAC ATT CCA ACC CCT TAT ACG TGA AAC TTC CAG TAG
                      TTT CCT A

8-29       72         CTT GAA GAT CCT CAT AAG ACG ATT AAA CAA TCC ACT GGA TAT
                      AAT CCG GA

8-34       73         CGA ATA GTG TCC ATG ATT ACA CCA ATA ACT GCC TGC CTA TCA
                      TGT TTA TG

8-35       74         CCA AGA GAG TAT CGG ATA CAC TTG GAA CAT AGC TAA CTC GAA
                      CTG TAC CA

8-36       75         CCA CTG ATA AAT AGG TAA CTG TCT CAT ATC TGC CAA TCA TAT
                      GCC GTA

8-37       76         CCC AAA TTA TAA ACA ATT TAA CAC AAG CAA AAG GAG GTT CAT
                      TGC TCC GC

8-39       77         CAA TAA ACT GGT GCT AAA CCT AAT ACC TTG TAT CCA AGT TAT
                      CCT CCC CC

^1 identical to 10-4, 10-40

^2 identical to 8-20, 8-32, 8-38, 10-1, 10-34; 1 mutation to 10-11; 3 mutations
to 10-29

Table 3

Cloned Individuals from 10th Round of Amplification

Clone No.  SEQ ID NO  "N50" Nucleotide Sequence (5'-3')

10-3^3     78         CCG AAT GAC ATC CGT AGT GGA ACC TTG CTT TTG ACA CTA AGA
                      AGC TAC AC

10-10      79         CCA TAA CAA ATA CCA TAG TAA AGA TCT GCA TTA TAT TAT ATC
                      GGT CCA CC

10-12      80         CAG AAC AAA GAT CAG TAG CTA AAC ATA TGG TAC AAA CAT ACC
                      ATC TCG CA

10-14      81         CCT TTA GTT AGG CTA GCT ACA ACG ATT TTT CCC TGC TTG GCA
                      ACG ACA C

10-15      82         CTC CCT ACG TTA CAC CAG CGG TAC GAA TTT TCC ACG AGA GGT
                      AAT CCG CA

10-19      83         CGG CAC CTC TAG TTA G AC ACT CCG GAA TTT TTC CCC

10-39      84         CGG CAC CTC TAG TTA G AC ACT CCG GAA TTT TAG CCT ACC ATA
                      GTC CGG T

10-23      85         CCC TTT GGT TAG GCT AGC TAC AAC GAT TTT TCC CTG CTT GAA
                      TTG TA

10-27^4    86         CCC TTT GGT TAG GCT AGC TAC AAC GAT TTT TCC CTG CTT GAC
                      CTG TTA CGA

10-31      87         CCT TTA GTT AGG CTA GCT ACA ACG ATT TTT CCC TGC TTG GAA
                      CGA CAC

10-18      88         CAT GGC TTA ATC ATC CTC AAT AGA AGA CTA CAA GTC GAA TAT
                      GTC CCC CC

10-20      89         CAA CAG AGC GAG TAT CAC CCC CTG TCA ATA GTC GTA TGA AAC
                      ATT GGG CC

10-6       90         TAC CGA CAA GGG GAA TTA AAA GCT AGC TGG TTA TGC AAC CCT
                      TTT CGC A

10-7       91         CTC GAA ACA GTG ATA TTC TGA ACA AAC GGG TAC TAC GTG TTC
                      AGC CCC C

10-8       92         CCA ATA ACG TAA CCC GGT TAG ATA AGC ACT TAG CTA AGA TGT
                      TTA TCC TG

10-16      93         CAA TAC AAT CGG TAC GAA TCC AGA AAC ATA ACG TTG TTT CAG
                      AAT GGT CC

10-21      94         GCA ACA ACA AGA ACC AAG TTA CAT ACA CGT TCA TCT ATA CTG
                      AAC CCC CA

10-24      95         CCT TTG AGT TCC TAA ATG CCG CAC GGT AAG CTT GGC ACA CTT
                      TGA CTG TA

10-28      96         CAA AGA TCT CAC TTT GGA AAT GCG AAA TAT GTA TAT TCG CCC
                      TGT CTG C

10-33      97         CCA CGT AGA ATT ATC TGA TTT ATA ACA TAA CGC AGG ATA ACT
                      CTC GCC CA

10-35      98         CAC AAG AAA GTG TCG TCT CCA GAT ATT TGA GTA CAA GGA ACT
                      ACG CCC

10-36      99         CAT GAA GAA ATA GGA CAT TCT ACA GGC TGG ACC GTT ACT ATG
                      CCT GTA GG

10-37      100        CAT AGG ATA ATC ATG GCG ATG CTT ATG ACG TGT ACA TCT ATA
                      CCT T

10-38      101        CAG ATG ATC TTC CTT TAA AG A CTA CCC TTT AAA GAA ACA TAA
                      GGT ACC CC

^3 1 mutation to 10-5

^4 1 mutation to 10-30

The self-cleavage activity of various clones was subsequently measured. Clones
8-5, 8-17, and 10-3 were found to cleave efficiently at the site 5' GUAACU↓AGAGAU
3', while clones 10-14, 10-19 and 10-27 were found to cleave efficiently at the site 5'
G↓UAACUAGAGAU 3'. When the RNA portion of the molecule was extended to the
sequence 5' GGAAAAAGUAACUAGAGAUGGAAG 3' (residue nos. 1-24 of SEQ ID NO
51), clones 8-17, 10-14, and 10-27 retained full activity, while clones 8-5, 10-3, and
10-19 showed diminished activity. Subsequently, clone 10-23 was found to exhibit a
high level of activity in the self-cleavage reaction involving the extended RNA domain.

It should also be noted, in the event one of skill in the relevant art does not
appreciate same, that the nucleotide sequences preceding and following the "N50"
segments of the polynucleotide molecules engineered according to the teachings of the
present disclosure may be altered in a variety of ways in order to generate
enzymatic DNA molecules of particular specificities. For example, while residue nos.
1-24 of SEQ ID NO 51 are described herein as RNA nucleotides, they may alternatively
comprise DNA, RNA, or composites thereof. (Thus, for example, SEQ ID NO 51 could
easily be altered so that nucleic acid residue nos. 1-7 would comprise DNA, residue nos.
8-19 would comprise RNA, residue nos. 20-99 would comprise DNA, and so on.)
Similarly, the nucleotides following the "N50" region may comprise RNA, DNA, or
composites thereof. The length of the regions preceding and following the "N50" (or
"N40" -- see Example 4) region(s) may also be varied, as disclosed herein. Further,
sequences preceding and/or following N50 or N40 regions may be shortened, expanded,
or deleted in their entirety.

Moreover, as noted above, we selected a specific region of HIV-1 RNA as the 
target sequence in the methods described in this Example; such a sequence is not the 
only sequence one may use as a target. Clearly, one of skill in the relevant art may 
follow our teachings herein to engineer and design enzymatic DNA molecules with 
specificity for other target sequences. As disclosed herein, such target sequences may 
be constructed or inserted into larger sequences comprising DNA, RNA, or composites 
thereof, as illustrated by SEQ ID NOS 50 and 51.

The self-cleavage reaction was easily converted to an intermolecular cleavage
reaction by dividing the enzyme and substrate domains into separate molecules. Clones
8-17 and 10-23 were chosen as prototype molecules. Both were shown to act as DNA
enzymes in the cleavage of a separate all-RNA substrate in a reaction that proceeds with
multiple turnover (Fig. 8). The substrate binding arms were subsequently reduced to 7
base-pairs on each side of the unpaired nucleotide that demarcates the cleavage site
(Fig. 9).

Figure 8 illustrates the nucleotide sequences, cleavage sites, and turnover rates
of two catalytic DNA molecules of the present invention, clones 8-17 and 10-23.
Reaction conditions were as shown, namely, 10 mM Mg2+, pH 7.5, and 37°C. The
DNAzyme identified as clone 8-17 is illustrated on the left, with the site of cleavage of
the RNA substrate indicated by the arrow. The substrate sequence
(5'-GGAAAAAGUAACUAGAGAUGGAAG-3') -- which is separate from the DNAzyme (i.e.,
intermolecular cleavage is shown) -- is labeled as such. Similarly, the DNAzyme
identified herein as 10-23 is shown on the right, with the site of cleavage of the RNA
substrate indicated by the arrow. Again, the substrate sequence is indicated. For the
8-17 enzyme, the turnover rate was approximately 0.6 hr^-1; for the 10-23 enzyme, the
turnover rate was approximately 1 hr^-1.

As illustrated in Fig. 8, the nucleotide sequence of the clone 8-17 catalytic DNA
molecule capable of cleaving a separate substrate molecule was as follows:
5'-CTTCCACCTTCCGAGCCGGACGAAGTTACTTTTT-3' (residue nos. 1-34 of SEQ ID
NO 56). In that same figure, the nucleotide sequence of the clone 10-23 catalytic DNA
molecule capable of cleaving a separate substrate molecule was as follows:
5'-CTTTGGTTAGGCTAGCTACAACGATTTTTCC-3' (residue nos. 3-33 of SEQ ID NO
85).

Figure 9 further illustrates the nucleotide sequences, cleavage sites, and
turnover rates of two catalytic DNA molecules of the present invention, clones 8-17 and
10-23. Reaction conditions were as shown, namely, 10 mM Mg2+, pH 7.5, and 37°C.
As in Fig. 8, the DNAzyme identified as clone 8-17 is illustrated on the left, with the site
of cleavage of the RNA substrate indicated by the arrow. The substrate sequence
(5'-GGAAAAAGUAACUAGAGAUGGAAG-3') -- which is separate from the DNAzyme (i.e.,
intermolecular cleavage is shown) -- is labeled as such. Similarly, the DNAzyme
identified herein as 10-23 is shown on the right, with the site of cleavage of the RNA
substrate indicated by the arrow. Again, the substrate sequence is indicated. For the
8-17 enzyme, k_obs was approximately 0.002 min^-1; for the 10-23 enzyme, the value of
k_obs was approximately 0.01 min^-1.
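
To put the observed rate constants in perspective, they can be converted to half-times and to the fraction of substrate cleaved after a given interval. The sketch below assumes simple first-order (single-exponential) kinetics, which the text does not state explicitly, and uses the approximate values of 0.002 min^-1 (clone 8-17) and 0.01 min^-1 (clone 10-23); all names are illustrative.

```python
import math

# Approximate observed rate constants (min^-1) reported for Fig. 9.
K_OBS = {"8-17": 0.002, "10-23": 0.01}


def half_time(k_obs: float) -> float:
    """Time (min) for half of the substrate to react: ln(2) / k_obs."""
    return math.log(2) / k_obs


def fraction_cleaved(k_obs: float, minutes: float) -> float:
    """Fraction cleaved after `minutes`, assuming first-order decay."""
    return 1.0 - math.exp(-k_obs * minutes)


if __name__ == "__main__":
    for clone, k in K_OBS.items():
        print(
            f"clone {clone}: half-time {half_time(k):.0f} min, "
            f"fraction cleaved after 2 h {fraction_cleaved(k, 120):.2f}"
        )
```

Under this assumption, the fivefold difference in k_obs translates directly into a fivefold difference in half-time between the two prototype enzymes.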

As illustrated in Fig. 9, the nucleotide sequence of the clone 8-17 catalytic DNA
molecule capable of cleaving a separate substrate molecule was as follows:
5'-CCACCTTCCGAGCCGGACGAAGTTACT-3' (residue nos. 4-30 of SEQ ID NO 56). In
that same figure, the nucleotide sequence of the clone 10-23 catalytic DNA molecule
capable of cleaving a separate substrate molecule was as follows:
5'-CTAGTTAGGCTAGCTACAACGATTTTTCC-3' (residue nos. 5-33 of SEQ ID NO 85,
with "CTA" substituted for "TTG" at the 5' end).
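
The residue ranges cited for Figs. 8 and 9 can be cross-checked against the "N50" segments listed in Tables 2 and 3. The following is a minimal sketch using 1-based, inclusive residue numbering as in the text; the helper name `residues` is hypothetical.

```python
def residues(seq: str, first: int, last: int) -> str:
    """Return residues first..last (1-based, inclusive), ignoring spaces."""
    seq = "".join(seq.split())
    return seq[first - 1:last]


# "N50" segments of clones 8-17 and 10-23 as listed in Tables 2 and 3.
SEQ_56 = "CTT CCA CCT TCC GAG CCG GAC GAA GTT ACT TTT TAT CAC ACT ACG TAT TG"
SEQ_85 = "CCC TTT GGT TAG GCT AGC TAC AAC GAT TTT TCC CTG CTT GAA TTG TA"

# Enzyme sequences as given for Fig. 8.
FIG8_8_17 = residues(SEQ_56, 1, 34)
FIG8_10_23 = residues(SEQ_85, 3, 33)

# Enzyme sequences as given for Fig. 9 (shortened binding arms).
FIG9_8_17 = residues(SEQ_56, 4, 30)
# Residues 5-33 with "CTA" substituted for the first three residues ("TTG").
FIG9_10_23 = "CTA" + residues(SEQ_85, 8, 33)

if __name__ == "__main__":
    print("Fig. 8, 8-17: ", FIG8_8_17)
    print("Fig. 8, 10-23:", FIG8_10_23)
    print("Fig. 9, 8-17: ", FIG9_8_17)
    print("Fig. 9, 10-23:", FIG9_10_23)
```

A check of this kind is useful when transcribing sequences between the tables, the figures and the sequence listing, where a single dropped residue would shift every downstream position.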

The catalytic rate of the RNA-cleaving DNA enzymes has yet to be fully
optimized. As disclosed above and as reported in previous studies, we have been able
to improve the catalytic rate by partially randomizing the prototype molecule and
carrying out additional rounds of selective amplification. We have found, however, that
the Km for Mg2+ is approximately 5 mM and 2 mM for the 8-17 and 10-23 DNA
enzymes, respectively, measured at pH 7.5 and 37°C; this is certainly compatible with
intracellular conditions.
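
The practical meaning of these Km values can be illustrated with a simple saturation calculation. The sketch below assumes a hyperbolic (Michaelis-Menten-type) dependence of rate on Mg2+ concentration, which is an assumption of this illustration rather than a statement from the text; the 1 mM concentration is included only as an arbitrary lower example value.

```python
# Approximate Km values for Mg2+ (mM) reported above.
KM_MG_MM = {"8-17": 5.0, "10-23": 2.0}


def fractional_rate(mg_mM: float, km_mM: float) -> float:
    """Fraction of the maximal rate at a given Mg2+ concentration.

    Assumes rate = Vmax * [Mg2+] / (Km + [Mg2+]).
    """
    return mg_mM / (km_mM + mg_mM)


if __name__ == "__main__":
    for clone, km in KM_MG_MM.items():
        for mg in (1.0, 10.0):
            frac = fractional_rate(mg, km)
            print(f"clone {clone}, {mg:4.1f} mM Mg2+: {frac:.2f} of maximal rate")
```

Under this model, both enzymes operate at well over half of their maximal rate at the 10 mM Mg2+ used in the selection, and the enzyme with the lower Km retains a larger share of its activity as the Mg2+ concentration falls.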



The foregoing specification, including the specific embodiments and examples, is
intended to be illustrative of the present invention and is not to be taken as limiting.
Numerous other variations and modifications can be effected without departing from the
true spirit and scope of the present invention.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: The Scripps Research Institute
(ii) TITLE OF INVENTION: ENZYMATIC DNA MOLECULES
(iii) NUMBER OF SEQUENCES: 101

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: The Scripps Research Institute
(B) STREET: 10666 North Torrey Pines Road, TPC-8
(C) CITY: La Jolla
(D) STATE: California
(E) COUNTRY: United States
(F) ZIP: 92037

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: PCT/US95/
(B) FILING DATE: 01-DEC-1995
(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/472,194
(B) FILING DATE: 07-JUN-1995

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/349,023
(B) FILING DATE: 02-DEC-1994

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Logan, April C.
(B) REGISTRATION NUMBER: 33,950
(C) REFERENCE/DOCKET NUMBER: 463.2 PC

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (619) 554-2937
(B) TELEFAX: (619) 554-6312


(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

CGGTAAGCTT GGCAC 15

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(8, "")
(D) OTHER INFORMATION: /standard_name= "ADENOSINE RIBONUCLEOTIDE"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

TCACTATNAG GAAGAGATGG 20
(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 38 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

ACACATCTCT GAAGTAGCGC CGCCGTATAG TGACGCTA 38

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 80 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

GTGCCAAGCT TACCGNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 60
NNNNNGTCGC CATCTCTTCC 80

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 28
(D) OTHER INFORMATION: /standard_name= "2' 3' CYCLIC PHOSPHATE"

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(28, "")
(D) OTHER INFORMATION: /standard_name= "ADENOSINE RIBONUCLEOTIDE"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GGGACGAATT CTAATACGAC TCACTATN 28

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(28, "")
(D) OTHER INFORMATION: /standard_name= "ADENOSINE RIBONUCLEOTIDE"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

GGGACGAATT CTAATACGAC TCACTATN 28



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(8, "")
(D) OTHER INFORMATION: /standard_name= "ADENOSINE RIBONUCLEOTIDE"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

TCACTATNGG AAGAGATGG 19

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 8 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(8, "")
(D) OTHER INFORMATION: /standard_name= "ADENOSINE NUCLEOTIDE"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

TCACTATN 8

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

CCATCTCTTC CTATAGTGAG TCCGGCTGCA 30
(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

GTGCCAAGCT TACCG 15

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 43 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

CTGCAGAATT CTAATACGAC TCACTATAGG AAGAGATGGC GAC 43

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(8, "")
(D) OTHER INFORMATION: /standard_name= "ADENOSINE RIBONUCLEOTIDE"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

TCACTATNGG AAGAGATGG 19

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 43 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(28, "")
(D) OTHER INFORMATION: /standard_name= "ADENOSINE RIBONUCLEOTIDE"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

GGGACGAATT CTAATACGAC TCACTATNGG AAGAGATGGC GAC 43

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

TCACACATCT CTGAAGTAGC GCCGCCGTAT GTGACGCTAG GGGTTCGCCT 50
(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

GGGGGGAACG CCGTAACAAG CTCTGAACTA GCGGTTGCGA TATAGTCGTA 50

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

CGGGACTCCG TAGCCCATTG CTTTTTGCAG CGTCAACGAA TAGCGTATTA 50

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

CCACCATGTC TTCTCGAGCC GAACCGATAG TTACGTCATA CCTCCCGTAT 50

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

GCCAGATTGC TGCTACCAGC GGTACGAAAT AGTGAAGTGT TCGTGACTAT 50

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

ATAGGCCATG CTTTGGCTAG CGGCACCGTA TAGTGTACCT GCCCTTATCG 50

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

TCTGCTCTCC TCTATTCTAG CAGTGCAGCG AAATATGTCG AATAGTCGGT 50
(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

TTGCCCAGCA TAGTCGGCAG ACGTGGTGTT AGCGACACGA TAGGCCCGGT 50

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

TTGCTAGCTC GGCTGAACTT CTGTAGCGCA ACCGAAATAG TGAGGCTTGA 50

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 107 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(28, "")
(D) OTHER INFORMATION: /standard_name= "ADENOSINE RIBONUCLEOTIDE"
/label= rA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

GGGACGAATT CTAATACGAC TCACTATNGG AAGAGATGGC GACATCTCNN NNNNNNNNNN 60
NNNNNNNNNN NNNNNNNNNN NNNNNNNNGT GACGGTAAGC TTGGCAC 107

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 49 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

CCGCCCACCT CTTTTACGAG CCTGTACGAA ATAGTGCTCT TGTTAGTAT 49

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 48 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

TCTCTTCAGC GATGCACGCT TGTTTTAATG TTGCACCCAT GTTAGTGA 48

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 46 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

TCTCATCAGC GATTGAACCA CTTGGTGGAC AGACCCATGT TAGTGA 46

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 49 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

CCGCCCACCT CTTTTACGAG CCTGTACGAA ATAGTGTTCT TGTTAGTAT 49



(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 49 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

CCGCCCACCT CTTTTACGAG CCTGTACGAA ATAGTGCTCT CGTTAGTAT 49

(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 48 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

TCTCAGACTT AGTCCATCAC ACTCTGTGCA TATGCCTGCT TGATGTGA 48

(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 42 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

CTCTCATCTG CTAGCACGCT CGAATAGTGT CAGTCGATGT GA 42

(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 0 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



10 



(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 3 1 : 

15 T AC AG C G ATT CACCCTTGTT TAAGGGTTAC ACCCATGTTA 4 9 

(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 40 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

ATCAGCGATT AACGCTTGTT TCAATGTTAC ACCCATGTTA 40 

(2) INFORMATION FOR SEQ ID NO: 33: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 40 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 
TTCAGCGATT AACGCTTATT TTAGCGTTAC ACCCATGTTA 40 

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 40 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 
ATCAGCGATT CACCCTTGTT TTAAGGTTGC ACCCATGTTA 40 



(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 40 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 
ATCAGCGATT CACCCTTGTT TAAGCGTTAC ACCCATGTTG 40 

(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 40 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 
ATCAGCGATT CACCCTTGTT TTAAGGTTAC ACCCATGTTA 40 



(2) INFORMATION FOR SEQ ID NO: 37: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 40 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 
ATCAGCGATT AACGCTTATT TTAGCGTTAC ACCCATGTTA 40 

(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 40 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 
ATCAGCGATT AACGCTTGTT TTAGTGTTGC ACCCATGTTA 40 
(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 40 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 
ATCAGCGATT AACGCTTATT TTAGCATTAC ACCCATGTTA 40 

(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 10 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 
GCCATGCTTT 10 

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 10 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 
CTCTATTTCT 10 

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 12 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 
TATGTGACGC TA 12 

(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 10 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 
TATAGTCGTA 10 

(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 11 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 
ATAGCGTATT A 11 

(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:45: 
ATAGTTACGT CAT 

(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 
AATAGTGAAG TGTT 14 

(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 
ATAGGCCCGG T 11 

(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 14 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 
AATAGTGAGG CTTG 14 

(2) INFORMATION FOR SEQ ID NO: 49: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 12 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: RNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 
GUAACUAGAG AU 12 
(2) INFORMATION FOR SEQ ID NO: 50: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 98 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 
(ix) FEATURE: 
(A) NAME/KEY: misc_feature 
(B) LOCATION: 7..18 
(D) OTHER INFORMATION: /note= "Position 7-18 is RNA; the 
remainder of the sequence is DNA." 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

GGAAAAGUAA CUAGAGAUGG AAGAGATGGC GACNNNNNNN NNNNNNNNNN NNNNNNNNNN 60 
NNNNNNNNNN NNNNNNNNNN NNNCGGTAAG CTTGGCAC 98 

(2) INFORMATION FOR SEQ ID NO: 51: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 99 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 
(ix) FEATURE: 
(A) NAME/KEY: misc_feature 
(B) LOCATION: 1..24 
(D) OTHER INFORMATION: /note= "Positions 1-24 is RNA; the 
remainder of the sequence is DNA." 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 

GGAAAAAGUA ACUAGAGAUG GAAGAGATGG CGACNNNNNN NNNNNNNNNN NNNNNNNNNN 60 
NNNNNNNNNN NNNNNNNNNN NNNNCGGTAA GCTTGGCAC 99 

(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 50 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 
CCAATAGTGC TACTGTGTAT CTCAATGCTG GAAACACGGG TTATCTCCCG 50 

(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 50 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 
CCAAAACAGT GGAGCATTAT ATCTACTCCA CAAAGACCAC TTTTCTCCCG 50 

(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 50 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 
ATCCGTACTA GCATGCAGAC AGTCTGTCTG CTTTTTCATT ACTCACTCCC 50 

(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 49 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 
CAATTCATGA TGACCAACTC TGTCAACACG CGAACTTTTA ACACTGGCA 49 

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 50 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 
CTTCCACCTT CCGAGCCGGA CGAAGTTACT TTTTATCACA CTACGTATTG 50 

(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 50 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 
GGCAAGAGAT GGCATATATT CAGGTAACTG TGGAGATACC CTGTCTGCCA 50 
(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 50 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 
CTAGACCATT CACGTTTACC AAGCTATGGT AAGAACTAGA ATCACGCGTA 50 

(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 50 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 
CGTACACGTG GAAAAGCTAT AAGTCAAGTT CTCATCATGT ACCTGACCGC 50 



(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 50 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 
CAGTGATACA TGAGTGCACC GCTACGACTA AGTCTGTAAC TTATTCTACC 50 

(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 50 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 
ACCGAATTAA ACTACCGAAT AGTGTGGTTT CTATGCTTCT TCTTCCCTGA 50 

(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 50 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 
CAGGTAGATA TAATGCGTCA CCGTGCTTAC ACTCGTTTTA TTAGTATGTC 50 

(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 49 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 
CCCTACAACA CCACTGGGCC CAATTAGATT AACGCTATTT TATAACTCG 49 

(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 49 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 
CCAAACGGTT ATAAGACTGA AAACTCAATC AATAGCCCAA TCCTCGCCC 49 



(2) INFORMATION FOR SEQ ID NO: 65: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 50 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 
CACATGTATA CCTAAGAAAT TGGTCCCGTA GACGTCACAG ACTTACGCCA 50 

(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 50 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 
CACAACGAAA ACAATCTTCC TTGGCATACT GGGGAGAAAG TCTGTTGTCC 50 
(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 50 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 
CACACGAACA TGTCCATTAA ATGGCATTCC GTTTTTCGTT CTACATATGC 50 

(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 49 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 
CAGAACGAGG GTCTTGTAAG ACTACACCTC CTCAGTGACA ATAATCCTG 49 

(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 49 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 
CACTACAGCC TGATATATAT GAAGAACAGG CAACAAGCTT ATGCACTGG 49 

(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 50 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 
GGGTACATTT ATGATTCTCT TATAAAGAGA ATATCGTACT CTTTTCCCCA 50 

(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 49 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 
CCAAAGTACA TTCCAACCCC TTATACGTGA AACTTCCAGT AGTTTCCTA 49 
(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 50 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 
CTTGAAGATC CTCATAAGAC GATTAAACAA TCCACTGGAT ATAATCCGGA 50 

(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 50 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 
CGAATAGTGT CCATGATTAC ACCAATAACT GCCTGCCTAT CATGTTTATG 50 

(2) INFORMATION FOR SEQ ID NO: 74: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 50 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 
CCAAGAGAGT ATCGGATACA CTTGGAACAT AGCTAACTCG AACTGTACCA 50 
(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 
CCACTGATAA ATAGGTAACT GTCTCATATC TGCCAATCAT ATGCCGTA 48 
(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 base pairs 

(B) TYPE: nucleic acid 



(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 
CCCAAATTAT AAACAATTTA ACACAAGCAA AAGGAGGTTC ATTGCTCCGC 50 

(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: 
CAATAAACTG GTGCTAAACC TAATACCTTG TATCCAAGTT ATCCTCCCCC 50 
(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 
CCGAATGACA TCCGTAGTGG AACCTTGCTT TTGACACTAA GAAGCTACAC 50 
(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 base pairs 

(B) TYPE: nucleic acid 

(A) LENGTH: 50 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
GGGGGGAACG CCGTAACAAG CTCTGAACTA GCGGTTGCGA TATAGTCGTA 50 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 50 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
CGGGACTCCG TAGCCCATTG CTTTTTGCAG CGTCAACGAA TAGCGTATTA 50 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
CCACCATGTC TTCTCGAGCC GAACCGATAG TTACGTCATA CCTCCCGTAT 50 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 50 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
GCCAGATTGC TGCTACCAGC GGTACGAAAT AGTGAAGTGT TCGTGACTAT 50 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 50 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
ATAGGCCATG CTTTGGCTAG CGGCACCGTA TAGTGTACCT GCCCTTATCG 50 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 50 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
TCTGCTCTCC TCTATTCTAG CAGTGCAGCG AAATATGTCG AATAGTCGGT 50 
(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 base pairs 

(B) TYPE : nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21: 
TTGCCCAGCA TAGTCGGCAG ACGTGGTGTT AGCGACACGA TAGGCCCGGT 50 

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY : linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: 
TTGCTAGCTC GGCTGAACTT CTGTAGCGCA ACCGAAATAG TGAGGCTTGA 50 
(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 107 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 
(A) NAME/KEY: misc_difference 
(B) LOCATION: replace(28, "") 
(D) OTHER INFORMATION: /standard_name= "ADENOSINE 
RIBONUCLEOTIDE" 
/label= rA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

GGGACGAATT CTAATACGAC TCACTATNGG AAGAGATGGC GACATCTCNN NNNNNNNNNN 60 
NNNNNNNNNN NNNNNNNNNN NNNNNNNNGT GACGGTAAGC TTGGCAC 107 

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 49 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 
CCGCCCACCT CTTTTACGAG CCTGTACGAA ATAGTGCTCT TGTTAGTAT 49 

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 48 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 
TCTCTTCAGC GATGCACGCT TGTTTTAATG TTGCACCCAT GTTAGTGA 48 
(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 
TCTCATCAGC GATTGAACCA CTTGGTGGAC AGACCCATGT TAGTGA 46 

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 49 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 
CCGCCCACCT CTTTTACGAG CCTGTACGAA ATAGTGTTCT TGTTAGTAT 49 

(2) INFORMATION FOR SEQ ID NO: 28: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 49 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 
CCGCCCACCT CTTTTACGAG CCTGTACGAA ATAGTGCTCT CGTTAGTAT 49 

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 48 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
TCTCAGACTT AGTCCATCAC ACTCTGTGCA TATGCCTGCT TGATGTGA 48 

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 42 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 
CTCTCATCTG CTAGCACGCT CGAATAGTGT CAGTCGATGT GA 42 
(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
TACAGCGATT CACCCTTGTT TAAGGGTTAC ACCCATGTTA 40 

(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

ATCAGCGATT AACGCTTGTT TCAATGTTAC ACCCATGTTA 40 

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 
TTCAGCGATT AACGCTTATT TTAGCGTTAC ACCCATGTTA 40 

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 40 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 
ATCAGCGATT CACCCTTGTT TTAAGGTTGC ACCCATGTTA 40 



(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 40 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 
ATCAGCGATT CACCCTTGTT TAAGCGTTAC ACCCATGTTG 40 

(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 40 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 
ATCAGCGATT CACCCTTGTT TTAAGGTTAC ACCCATGTTA 40 

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 
ATCAGCGATT AACGCTTATT TTAGCGTTAC ACCCATGTTA 40 
(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 
ATCAGCGATT AACGCTTGTT TTAGTGTTGC ACCCATGTTA 40 

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 40 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 
ATCAGCGATT AACGCTTATT TTAGCATTAC ACCCATGTTA 40 



(2) INFORMATION FOR SEQ ID NO: 40: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 10 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 
GCCATGCTTT 10 



(2) INFORMATION FOR SEQ ID NO: 41: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 10 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 
CTCTATTTCT 10 

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42: 
TATGTGACGC TA 

(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 
TATAGTCGTA 

(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 11 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 
ATAGCGTATT A 11 

(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 13 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 
ATAGTTACGT CAT 13 



(2) INFORMATION FOR SEQ ID NO: 46: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 14 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 
AATAGTGAAG TGTT 14 

(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 
ATAGGCCCGG T 11 

(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 14 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 
AATAGTGAGG CTTG 14 



(2) INFORMATION FOR SEQ ID NO: 49: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 12 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: RNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 
GUAACUAGAG AU 12 
(2) INFORMATION FOR SEQ ID NO: 50: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 98 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 
(ix) FEATURE: 
(A) NAME/KEY: misc_feature 
(B) LOCATION: 7..18 
(D) OTHER INFORMATION: /note= "Position 7-18 is RNA; the 
remainder of the sequence is DNA." 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

GGAAAAGUAA CUAGAGAUGG AAGAGATGGC GACNNNNNNN NNNNNNNNNN NNNNNNNNNN 60 
NNNNNNNNNN NNNNNNNNNN NNNCGGTAAG CTTGGCAC 98 

(2) INFORMATION FOR SEQ ID NO: 51: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 99 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 
(ix) FEATURE: 
(A) NAME/KEY: misc_feature 
(B) LOCATION: 1..24 
(D) OTHER INFORMATION: /note= "Positions 1-24 is RNA; the 
remainder of the sequence is DNA." 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 

GGAAAAAGUA ACUAGAGAUG GAAGAGATGG CGACNNNNNN NNNNNNNNNN NNNNNNNNNN 60 
NNNNNNNNNN NNNNNNNNNN NNNNCGGTAA GCTTGGCAC 99 

(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 50 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 
CCAATAGTGC TACTGTGTAT CTCAATGCTG GAAACACGGG TTATCTCCCG 50 

(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 50 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 
CCAAAACAGT GGAGCATTAT ATCTACTCCA CAAAGACCAC TTTTCTCCCG 50 
(2) INFORMATION FOR SEQ ID NO: 54: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 50 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 
ATCCGTACTA GCATGCAGAC AGTCTGTCTG CTTTTTCATT ACTCACTCCC 50 
(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 49 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 
CAATTCATGA TGACCAACTC TGTCAACACG CGAACTTTTA ACACTGGCA 49 

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 50 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 
CTTCCACCTT CCGAGCCGGA CGAAGTTACT TTTTATCACA CTACGTATTG 50 

(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

GGCAAGAGAT GGCATATATT CAGGTAACTG TGGAGATACC CTGTCTGCCA 50

(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

CTAGACCATT CACGTTTACC AAGCTATGGT AAGAACTAGA ATCACGCGTA 50

(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

CGTACACGTG GAAAAGCTAT AAGTCAAGTT CTCATCATGT ACCTGACCGC 50

(2) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:

CAGTGATACA TGAGTGCACC GCTACGACTA AGTCTGTAAC TTATTCTACC 50

(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

ACCGAATTAA ACTACCGAAT AGTGTGGTTT CTATGCTTCT TCTTCCCTGA 50



(2) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

CAGGTAGATA TAATGCGTCA CCGTGCTTAC ACTCGTTTTA TTAGTATGTC 50

(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 49 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

CCCTACAACA CCACTGGGCC CAATTAGATT AACGCTATTT TATAACTCG 49

(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 49 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:

CCAAACGGTT ATAAGACTGA AAACTCAATC AATAGCCCAA TCCTCGCCC 49

(2) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

CACATGTATA CCTAAGAAAT TGGTCCCGTA GACGTCACAG ACTTACGCCA 50

(2) INFORMATION FOR SEQ ID NO: 66:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:

CACAACGAAA ACAATCTTCC TTGGCATACT GGGGAGAAAG TCTGTTGTCC 50

(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

CACACGAACA TGTCCATTAA ATGGCATTCC GTTTTTCGTT CTACATATGC 50

(2) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 49 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:

CAGAACGAGG GTCTTGTAAG ACTACACCTC CTCAGTGACA ATAATCCTG 49

(2) INFORMATION FOR SEQ ID NO: 69:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 49 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:

CACTACAGCC TGATATATAT GAAGAACAGG CAACAAGCTT ATGCACTGG 49

(2) INFORMATION FOR SEQ ID NO: 70:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:

GGGTACATTT ATGATTCTCT TATAAAGAGA ATATCGTACT CTTTTCCCCA 50

(2) INFORMATION FOR SEQ ID NO: 71:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 49 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

CCAAAGTACA TTCCAACCCC TTATACGTGA AACTTCCAGT AGTTTCCTA 49

(2) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:

CTTGAAGATC CTCATAAGAC GATTAAACAA TCCACTGGAT ATAATCCGGA 50

(2) INFORMATION FOR SEQ ID NO: 73:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:

CGAATAGTGT CCATGATTAC ACCAATAACT GCCTGCCTAT CATGTTTATG 50

(2) INFORMATION FOR SEQ ID NO: 74:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:

CCAAGAGAGT ATCGGATACA CTTGGAACAT AGCTAACTCG AACTGTACCA 50

(2) INFORMATION FOR SEQ ID NO: 75:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 48 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:

CCACTGATAA ATAGGTAACT GTCTCATATC TGCCAATCAT ATGCCGTA 48

(2) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:

CCCAAATTAT AAACAATTTA ACACAAGCAA AAGGAGGTTC ATTGCTCCGC 50

(2) INFORMATION FOR SEQ ID NO: 77:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:

CAATAAACTG GTGCTAAACC TAATACCTTG TATCCAAGTT ATCCTCCCCC 50

(2) INFORMATION FOR SEQ ID NO: 78:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:

CCGAATGACA TCCGTAGTGG AACCTTGCTT TTGACACTAA GAAGCTACAC 50

(2) INFORMATION FOR SEQ ID NO: 79:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:

CCATAACAAA TACCATAGTA AAGATCTGCA TTATATTATA TCGGTCCACC 50

(2) INFORMATION FOR SEQ ID NO: 80:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:

CAGAACAAAG ATCAGTAGCT AAACATATGG TACAAACATA CCATCTCGCA 50

(2) INFORMATION FOR SEQ ID NO: 81:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 49 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:

CCTTTAGTTA GGCTAGCTAC AACGATTTTT CCCTGCTTGG CAACGACAC 49

(2) INFORMATION FOR SEQ ID NO: 82:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:

CTCCCTACGT TACACCAGCG GTACGAATTT TCCACGAGAG GTAATCCGCA 50



(2) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 36 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:

CGGCACCTCT AGTTAGACAC TCCGGAATTT TTCCCC 36

(2) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 49 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:

CGGCACCTCT AGTTAGACAC TCCGGAATTT TAGCCTACCA TAGTCCGGT 49

(2) INFORMATION FOR SEQ ID NO: 85:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 47 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:

CCCTTTGGTT AGGCTAGCTA CAACGATTTT TCCCTGCTTG AATTGTA 47



(2) INFORMATION FOR SEQ ID NO: 86:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 51 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:

CCCTTTGGTT AGGCTAGCTA CAACGATTTT TCCCTGCTTG ACCTGTTACG A 51

(2) INFORMATION FOR SEQ ID NO: 87:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 48 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:

CCTTTAGTTA GGCTAGCTAC AACGATTTTT CCCTGCTTGG AACGACAC 48

(2) INFORMATION FOR SEQ ID NO: 88:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:

CATGGCTTAA TCATCCTCAA TAGAAGACTA CAAGTCGAAT ATGTCCCCCC 50



(2) INFORMATION FOR SEQ ID NO: 89:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:

CAACAGAGCG AGTATCACCC CCTGTCAATA GTCGTATGAA ACATTGGGCC 50

(2) INFORMATION FOR SEQ ID NO: 90:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 49 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:

TACCGACAAG GGGAATTAAA AGCTAGCTGG TTATGCAACC CTTTTCGCA 49

(2) INFORMATION FOR SEQ ID NO: 91:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 49 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:

CTCGAAACAG TGATATTCTG AACAAACGGG TACTACGTGT TCAGCCCCC 49

(2) INFORMATION FOR SEQ ID NO: 92:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:

CCAATAACGT AACCCGGTTA GATAAGCACT TAGCTAAGAT GTTTATCCTG 50

(2) INFORMATION FOR SEQ ID NO: 93:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:

CAATACAATC GGTACGAATC CAGAAACATA ACGTTGTTTC AGAATGGTCC 50

(2) INFORMATION FOR SEQ ID NO: 94:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:

GCAACAACAA GAACCAAGTT ACATACACGT TCATCTATAC TGAACCCCCA 50

(2) INFORMATION FOR SEQ ID NO: 95:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:

CCTTTGAGTT CCTAAATGCC GCACGGTAAG CTTGGCACAC TTTGACTGTA 50

(2) INFORMATION FOR SEQ ID NO: 96:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 49 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:

CAAAGATCTC ACTTTGGAAA TGCGAAATAT GTATATTCGC CCTGTCTGC 49



(2) INFORMATION FOR SEQ ID NO: 97:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:

CCACGTAGAA TTATCTGATT TATAACATAA CGCAGGATAA CTCTCGCCCA 50

(2) INFORMATION FOR SEQ ID NO: 98:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 48 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:

CACAAGAAAG TGTCGTCTCC AGATATTTGA GTACAAGGAA CTACGCCC 48

(2) INFORMATION FOR SEQ ID NO: 99:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99:

CATGAAGAAA TAGGACATTC TACAGGCTGG ACCGTTACTA TGCCTGTAGG 50

(2) INFORMATION FOR SEQ ID NO: 100:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 46 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:

CATAGGATAA TCATGGCGAT GCTTATGACG TGTACATCTA TACCTT 46

(2) INFORMATION FOR SEQ ID NO: 101:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:
CAGATGATCT TCCTTTAAAG ACTACCCTTT AAAGAAACAT AAGGTACCCC 
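
Because each entry of the sequence listing declares a LENGTH and also carries the residues themselves, the two can be cross-checked mechanically. The following is a minimal Python sketch offered only as a reading aid; it is not part of the original disclosure. The helper name residue_count and the dictionary layout are illustrative assumptions, and only four of the listed sequences (SEQ ID NOS 52, 55, 83 and 100) are copied in.

```python
import re


def residue_count(sequence_line: str) -> int:
    """Count nucleotide letters, ignoring spaces and any trailing running count."""
    return len(re.sub(r"[^A-Za-z]", "", sequence_line))


# SEQ ID NO -> (declared LENGTH, residues as printed in the listing)
LISTING = {
    52: (50, "CCAATAGTGC TACTGTGTAT CTCAATGCTG GAAACACGGG TTATCTCCCG"),
    55: (49, "CAATTCATGA TGACCAACTC TGTCAACACG CGAACTTTTA ACACTGGCA"),
    83: (36, "CGGCACCTCT AGTTAGACAC TCCGGAATTT TTCCCC"),
    100: (46, "CATAGGATAA TCATGGCGAT GCTTATGACG TGTACATCTA TACCTT"),
}


def mismatches(listing):
    """Return the SEQ ID NOS whose residue count differs from the declared LENGTH."""
    return [
        seq_id
        for seq_id, (declared, residues) in listing.items()
        if residue_count(residues) != declared
    ]


print(mismatches(LISTING))
```

A non-empty result would point to a transcription problem in either the LENGTH field or the residues, without indicating which of the two is wrong.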
We Claim:

1. A catalytic DNA molecule having site-specific endonuclease activity.

2. The catalytic DNA molecule of claim 1, wherein said endonuclease
activity is specific for a nucleotide sequence defining a cleavage site comprising
single-stranded nucleic acid in a substrate nucleic acid sequence.

3. The catalytic DNA molecule of claim 2, wherein said single stranded 
nucleic acid comprises RNA, DNA, modified RNA, modified DNA, nucleotide analogs, or 
composites thereof. 

4. The catalytic DNA molecule of claim 2, wherein said substrate nucleic
acid comprises RNA, DNA, modified RNA, modified DNA, nucleotide analogs, or
composites thereof.

5. The catalytic DNA molecule of claim 2, wherein said endonuclease 
activity comprises hydrolytic cleavage of a phosphoester bond at said cleavage site. 

6. The catalytic DNA molecule of claim 1, wherein said molecule is
single-stranded.

7. The catalytic DNA molecule of claim 1, wherein said molecule includes 
one or more hairpin loop structures. 

8. The catalytic DNA molecule of claim 1, wherein said substrate nucleic 
acid sequence is attached to said catalytic DNA molecule. 

9. The catalytic DNA molecule of claim 1, wherein said substrate nucleic
acid sequence is not attached to said catalytic DNA molecule.

10. The catalytic DNA molecule of claim 1, wherein said catalytic DNA
molecule comprises a nucleotide sequence selected from the group consisting of:
SEQ ID NO 3 and SEQ ID NOS 14 through 22.

11. The catalytic DNA molecule of claim 1, wherein said catalytic DNA
molecule comprises a nucleotide sequence selected from the group consisting of:
SEQ ID NOS 23 through 30.

12. The catalytic DNA molecule of claim 1, wherein said catalytic DNA
molecule comprises a nucleotide sequence selected from the group consisting of:
SEQ ID NOS 31 through 39.

13. The catalytic DNA molecule of claim 1, wherein said catalytic DNA 
molecule comprises a nucleotide sequence selected from the group consisting of: 

SEQ ID NOS 52 through 101. 

14. The catalytic DNA molecule of claim 11, 12, or 13, wherein said
endonuclease activity is enhanced by the presence of Mg2+.

15. The catalytic DNA molecule of claim 1, wherein said catalytic DNA
molecule has a substrate binding affinity of about 1 µM or less.

16. The catalytic DNA molecule of claim 1, wherein said catalytic DNA
molecule binds substrate with a K_D of less than about 0.1 µM.

17. The catalytic DNA molecule of claim 2, wherein said nucleotide 
sequence defining said cleavage site comprises at least one nucleotide. 

18. The catalytic DNA molecule of claim 1, wherein said endonuclease
activity is enhanced by the presence of a divalent cation.

19. The catalytic DNA molecule of claim 18, wherein said divalent cation is
selected from the group consisting of Pb2+, Mg2+, Mn2+, Zn2+, and Ca2+.

20. The catalytic DNA molecule of claim 1, wherein said endonuclease
activity is enhanced by the presence of a monovalent cation.

21. The catalytic DNA molecule of claim 20, wherein said monovalent cation
is selected from the group consisting of Na+ and K+.

22. The catalytic DNA molecule of claim 1, wherein said catalytic DNA
molecule comprises a conserved core flanked by first and second substrate binding
regions.

23. The catalytic DNA molecule of claim 22, further comprising one or more 
spacer nucleotides between said conserved core and said substrate binding region. 

24. The catalytic DNA molecule of claim 22, wherein said conserved core 
comprises one or more conserved regions. 

25. The catalytic DNA molecule of claim 24, wherein said one or more
conserved regions includes a nucleotide sequence selected from the group consisting of:
CG;
CGA;
AGCG;
AGCCG;

CAGCGAT; 
CTTGTTT; and 
CTTATTT. 
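
As a reading aid only, the short sequences recited in claim 25 can be located in a longer sequence by a simple overlapping substring search. The sketch below is in Python and forms no part of the claimed subject matter. The function name find_motifs is hypothetical, and the example sequence is SEQ ID NO: 81 as given in the sequence listing. A hit for a very short motif such as CG does not by itself indicate that the position lies in a conserved core.

```python
def find_motifs(sequence, motifs):
    """Return {motif: [0-based start positions]}, allowing overlapping hits."""
    hits = {}
    for motif in motifs:
        positions = []
        start = sequence.find(motif)
        while start != -1:
            positions.append(start)
            # Advance by one so that overlapping occurrences are also reported.
            start = sequence.find(motif, start + 1)
        hits[motif] = positions
    return hits


# Motifs as recited in claim 25.
CLAIM_25_MOTIFS = ["CG", "CGA", "AGCG", "AGCCG", "CAGCGAT", "CTTGTTT", "CTTATTT"]

# SEQ ID NO: 81 from the sequence listing, with the spacing removed.
SEQ_ID_NO_81 = "CCTTTAGTTAGGCTAGCTACAACGATTTTTCCCTGCTTGGCAACGACAC"

for motif, positions in find_motifs(SEQ_ID_NO_81, CLAIM_25_MOTIFS).items():
    print(motif, positions)
```

The same function can be pointed at the binding-region sequences of claims 27 through 29 by passing a different motif list.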

26. The catalytic DNA molecule of claim 24, further comprising one or more
variable or spacer nucleotides between said conserved regions in said conserved core.

27. The catalytic DNA molecule of claim 22, wherein said first substrate
binding region includes a nucleotide sequence selected from the group consisting of:
CATCTCT;
GCTCT;
TTGCTTTTT;
TGTCTTCTC;
TTGCTGCT;
GCCATGCTTT;
CTCTATTTCT;
GTCGGCA;
CATCTCTTC; and
ACTTCT.

28. The catalytic DNA molecule of claim 22, wherein said second substrate
binding region includes a nucleotide sequence selected from the group consisting of:
TATGTGACGCTA;
TATAGTCGTA;
ATAGCGTATTA;
ATAGTTACGTCAT;
AATAGTGAAGTGTT;
TATAGTGTA;
ATAGTCGGT;
ATAGGCCCGGT;
AATAGTGAGGCTTG; and
ATGNTG.

29. The catalytic DNA molecule of claim 22, further comprising a third
substrate binding region, wherein said third region includes a nucleotide sequence
selected from the group consisting of:
TGTT;
TGTTA; and
TGTTAG.

30. The catalytic DNA molecule of claim 29, further comprising one or more
spacer regions between said substrate binding regions.

31. A composition comprising two or more populations of catalytic DNA
molecules according to claim 1, wherein each population of catalytic DNA molecules is
capable of cleaving a different nucleotide sequence in a substrate.

32. A composition comprising two or more populations of catalytic DNA
molecules according to claim 1, wherein each population of catalytic DNA molecules is
capable of recognizing a different substrate.

33. A method of selecting a catalytic DNA molecule that cleaves a substrate
nucleic acid sequence at a specific site, comprising the following steps:

a. obtaining a population of single-stranded DNA molecules;

b. admixing nucleotide-containing substrate molecules with said population
of single-stranded DNA molecules to form an admixture;

c. maintaining said admixture for a sufficient period of time and under
predetermined reaction conditions to allow single-stranded DNA molecules in
said population to cause cleavage of said substrate sequences, thereby
producing substrate cleavage products;

d. separating said population of single-stranded DNA molecules from said
substrate sequences and substrate cleavage products; and

e. isolating single-stranded DNA molecules that cleave
nucleotide-containing substrate at a specific site from said population.

34. The method of claim 33, wherein said substrate comprises RNA. 

35. The method of claim 33, wherein said DNA molecules that cleave said
substrate at a specific site are tagged with an immobilizing agent.

36. The method of claim 35, wherein said agent comprises biotin. 

37. The method of claim 35, wherein said isolating step further comprises 
exposing said tagged DNA molecules to a solid surface having avidin linked thereto, 
whereby said tagged DNA molecules become attached to said solid surface. 

38. A method of cleaving a phosphoester bond, comprising:

a. admixing a catalytic DNA molecule capable of cleaving a substrate
nucleic acid sequence at a defined cleavage site with a phosphoester
bond-containing substrate, to form a reaction admixture; and

b. maintaining said admixture under predetermined reaction conditions to
allow said catalytic DNA molecule to cleave said phosphoester bond, thereby
producing a population of substrate products.

39. The method of claim 38, further comprising the steps of

a. separating said products from said catalytic DNA molecule; and

b. adding additional substrate to said catalytic DNA molecule to form a new
reaction admixture.

40. The method of claim 38, wherein said substrate comprises RNA. 

41. The method of claim 38, wherein said predetermined reaction conditions
include the presence of a monovalent cation, a divalent cation, or both.

42. A method of engineering catalytic DNA molecules that cleave
phosphoester bonds, comprising the following steps:

a. obtaining a population of single-stranded DNA molecules;

b. introducing genetic variation into said population to produce a variant
population;

c. selecting individuals from said variant population that meet
predetermined selection criteria;

d. separating said selected individuals from the remainder of said variant
population; and

e. amplifying said selected individuals.






43. A non-naturally-occurring catalytic DNA molecule comprising a 
nucleotide sequence defining a conserved core flanked by one or more recognition 
domains, variable regions, and spacer regions. 

44. The catalytic DNA molecule of claim 43, wherein said nucleotide 
sequence defines a first variable region contiguous or adjacent to the 5'-terminus of the 
molecule, a first recognition domain located 3'-terminal to the first variable region, a first 
spacer region located 3'-terminal to the first recognition domain, a first conserved region 
located 3'-terminal to the first spacer region, a second spacer region located 3'-terminal
to the first conserved region, a second conserved region located 3'-terminal to the
second spacer region, a second recognition domain located 3'-terminal to the second 
conserved region, and a second variable region located 3'-terminal to the second 
recognition domain. 

45. The catalytic DNA molecule of claim 43, wherein said nucleotide 
sequence defines a first variable region contiguous or adjacent to the 5'-terminus of the
molecule, a first recognition domain located 3'-terminal to the first variable region, a first
spacer region located 3'-terminal to the first recognition domain, a first conserved region
located 3'-terminal to the first spacer region, a second spacer region located 3'-terminal
to the first conserved region, a second conserved region located 3'-terminal to the
second spacer region, a second recognition domain located 3'-terminal to the second
conserved region, a second variable region located 3'-terminal to the second recognition
domain, and a third recognition domain located 3'-terminal to the second variable region. 
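
The cycle recited in claim 42 (obtain a population, introduce variation, select, separate, amplify) can be pictured with a toy simulation. The Python sketch below is an illustration under stated assumptions and not the procedure of the disclosure: sequences are plain strings, "variation" is random point substitution, and the score function is a hypothetical stand-in for cleavage activity that merely counts the best partial match to one claim 25 motif (CAGCGAT). All function names, parameters and default values are invented for the example.

```python
import random

ALPHABET = "ACGT"


def mutate(seq, rate, rng):
    """Step b: substitute each position with a random base with probability `rate`."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else base for base in seq
    )


def score(seq, motif="CAGCGAT"):
    """Hypothetical selection criterion: best number of matching positions to `motif`."""
    best = 0
    for i in range(len(seq) - len(motif) + 1):
        matches = sum(a == b for a, b in zip(seq[i:i + len(motif)], motif))
        best = max(best, matches)
    return best


def evolve(pop_size=200, length=50, rounds=5, keep=0.1, rate=0.05, seed=0):
    rng = random.Random(seed)
    # Step a: obtain a starting population of random single-stranded sequences.
    population = [
        "".join(rng.choice(ALPHABET) for _ in range(length))
        for _ in range(pop_size)
    ]
    n_keep = max(1, int(keep * pop_size))
    for _ in range(rounds):
        variants = [mutate(s, rate, rng) for s in population]   # step b
        ranked = sorted(variants, key=score, reverse=True)      # step c
        selected = ranked[:n_keep]                              # step d
        # Step e: amplify the selected individuals back to the full population size.
        population = [selected[i % n_keep] for i in range(pop_size)]
    return population


final = evolve()
print(max(score(s) for s in final))
```

In a laboratory setting the selection criterion is physical (for example, release from a solid support on cleavage) rather than a computed score, so the sketch only mirrors the order of the steps.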
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(C) STRAND&DNESS : single 



(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
<iv) ANTI- SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 
C C AT AAC AAA TACCATAGTA AAGATCTGCA TTATATTATA TCGGTCCACC 
(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 0 base pairs 

(B) TYPE: nucleic acid 



(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI -SENSE: NO 

(Xi) SEQUENCE DESCRIPTION : SEQ ID NO: 80: 
CAGAACAAAG ATCAGTAGCT AAACATATGG TACAAACATA CCATCTCGCA 
(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 4 9 base pairs 

(B) TYPE: nucleic acid 



(C) STRANDEDNESS : single 



(D) TOPOLOGY : linear 



(C) STRANDEDNESS : single 



(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
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(iv) ANTI- SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 
CCTTTAGTTA GGCTAGCTAC AACGATTTTT CCCTGCTTGG CAACGACAC 4 
(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA {genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:82: 
CTCCCTACGT TACACCAGCG GTACGAATTT TCCACGAGAG GTAATCCGCA 5 
(2) INFORMATION FOR SEQ ID NO:83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3 6 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANT I - SENSE : NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:83: 



CGGCACCTCT AGTTAGACAC TCCGGAATTT TTCCCC 
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(2) INFORMATION FOR SEQ ID NO: 84: 



(i) 



SEQUENCE CHARACTERISTICS : 



(A) LENGTH: 4 9 base pairs 



(B) TYPE: nucleic acid 



(C) STRANDEDNESS : single 



( D ) TOPOLOGY : 1 inear 



(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI- SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: 
CGGCACCTCT AGTTAGACAC TCCGGAATTT TAGCCTACCA TAGTCCGGT 
(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 7 base pairs 

(B) TYPE: nucleic acid 



(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI -SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85: 
CCCTTTGGTT AGGCTAGCTA CAACGATTTT TCCCTGCTTG AATTGTA 
(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 base pairs 

(B) TYPE: nucleic acid 



(C) STRANDEDNESS : single 



(D) TOPOLOGY: linear 
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<C) STRANDEDNESS : single 
<D} TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
<iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 86 : 
CCCTTTGGTT AGGCTAGCTA CAACGATTTT TCCCTGCTTG AC CTGTTACG A 
(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 
CCTTTAGTTA GGCTAGCTAC AACGATTTTT CCCTGCTTGG AACGACAC 
(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:

CATGGCTTAA TCATCCTCAA TAGAAGACTA CAAGTCGAAT ATGTCCCCCC 50

(2) INFORMATION FOR SEQ ID NO: 89:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:
CAACAGAGCG AGTATCACCC CCTGTCAATA GTCGTATGAA ACATTGGGCC 50

(2) INFORMATION FOR SEQ ID NO: 90:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 49 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:
TACCGACAAG GGGAATTAAA AGCTAGCTGG TTATGCAACC CTTTTCGCA 49

(2) INFORMATION FOR SEQ ID NO: 91:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 49 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:
CTCGAAACAG TGATATTCTG AACAAACGGG TACTACGTGT TCAGCCCCC 49



(2) INFORMATION FOR SEQ ID NO: 92:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:
CCAATAACGT AACCCGGTTA GATAAGCACT TAGCTAAGAT GTTTATCCTG 50

(2) INFORMATION FOR SEQ ID NO: 93:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:
CAATACAATC GGTACGAATC CAGAAACATA ACGTTGTTTC AGAATGGTCC 50

(2) INFORMATION FOR SEQ ID NO: 94:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:
GCAACAACAA GAACCAAGTT ACATACACGT TCATCTATAC TGAACCCCCA 50
(2) INFORMATION FOR SEQ ID NO: 95:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:
CCTTTGAGTT CCTAAATGCC GCACGGTAAG CTTGGCACAC TTTGACTGTA 50

(2) INFORMATION FOR SEQ ID NO: 96:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 49 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:
CAAAGATCTC ACTTTGGAAA TGCGAAATAT GTATATTCGC CCTGTCTGC 49

(2) INFORMATION FOR SEQ ID NO: 97:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:

CCACGTAGAA TTATCTGATT TATAACATAA CGCAGGATAA CTCTCGCCCA 50

(2) INFORMATION FOR SEQ ID NO: 98:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 48 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:
CACAAGAAAG TGTCGTCTCC AGATATTTGA GTACAAGGAA CTACGCCC 48

(2) INFORMATION FOR SEQ ID NO: 99:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99:
CATGAAGAAA TAGGACATTC TACAGGCTGG ACCGTTACTA TGCCTGTAGG 50

(2) INFORMATION FOR SEQ ID NO: 100:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 46 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:
CATAGGATAA TCATGGCGAT GCTTATGACG TGTACATCTA TACCTT 46

(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:
CAGATGATCT TCCTTTAAAG ACTACCCTTT AAAGAAACAT AAGGTACCCC 50

We Claim:

1. A catalytic DNA molecule having site-specific endonuclease activity.

2. The catalytic DNA molecule of claim 1, wherein said endonuclease
activity is specific for a nucleotide sequence defining a cleavage site comprising
single-stranded nucleic acid in a substrate nucleic acid sequence.

3. The catalytic DNA molecule of claim 2, wherein said single-stranded
nucleic acid comprises RNA, DNA, modified RNA, modified DNA, nucleotide analogs, or
composites thereof.

4. The catalytic DNA molecule of claim 2, wherein said substrate nucleic
acid comprises RNA, DNA, modified RNA, modified DNA, nucleotide analogs, or
composites thereof.

5. The catalytic DNA molecule of claim 2, wherein said endonuclease
activity comprises hydrolytic cleavage of a phosphoester bond at said cleavage site.

6. The catalytic DNA molecule of claim 1, wherein said molecule is
single-stranded.

7. The catalytic DNA molecule of claim 1, wherein said molecule includes
one or more hairpin loop structures.

8. The catalytic DNA molecule of claim 1, wherein said substrate nucleic
acid sequence is attached to said catalytic DNA molecule.

9. The catalytic DNA molecule of claim 1, wherein said substrate nucleic
acid sequence is not attached to said catalytic DNA molecule.

10. The catalytic DNA molecule of claim 1, wherein said catalytic DNA
molecule comprises a nucleotide sequence selected from the group consisting of:
SEQ ID NO 3 and SEQ ID NOS 14 through 22.

11. The catalytic DNA molecule of claim 1, wherein said catalytic DNA
molecule comprises a nucleotide sequence selected from the group consisting of:
SEQ ID NOS 23 through 30.

12. The catalytic DNA molecule of claim 1, wherein said catalytic DNA
molecule comprises a nucleotide sequence selected from the group consisting of:
SEQ ID NOS 31 through 39.

13. The catalytic DNA molecule of claim 1, wherein said catalytic DNA
molecule comprises a nucleotide sequence selected from the group consisting of:
SEQ ID NOS 52 through 101.

14. The catalytic DNA molecule of claim 11, 12, or 13, wherein said
endonuclease activity is enhanced by the presence of Mg²⁺.

15. The catalytic DNA molecule of claim 1, wherein said catalytic DNA
molecule has a substrate binding affinity of about 1 μM or less.

16. The catalytic DNA molecule of claim 1, wherein said catalytic DNA
molecule binds substrate with a K_D of less than about 0.1 μM.

17. The catalytic DNA molecule of claim 2, wherein said nucleotide
sequence defining said cleavage site comprises at least one nucleotide.

18. The catalytic DNA molecule of claim 1, wherein said endonuclease
activity is enhanced by the presence of a divalent cation.

19. The catalytic DNA molecule of claim 18, wherein said divalent cation is
selected from the group consisting of Pb²⁺, Mg²⁺, Mn²⁺, Zn²⁺, and Ca²⁺.

20. The catalytic DNA molecule of claim 1, wherein said endonuclease
activity is enhanced by the presence of a monovalent cation.

21. The catalytic DNA molecule of claim 20, wherein said monovalent cation
is selected from the group consisting of Na⁺ and K⁺.

22. The catalytic DNA molecule of claim 1, wherein said catalytic DNA
molecule comprises a conserved core flanked by first and second substrate binding
regions.

23. The catalytic DNA molecule of claim 22, further comprising one or more
spacer nucleotides between said conserved core and said substrate binding region.

24. The catalytic DNA molecule of claim 22, wherein said conserved core
comprises one or more conserved regions.

25. The catalytic DNA molecule of claim 24, wherein said one or more
conserved regions includes a nucleotide sequence selected from the group consisting of:

CG;
CGA;
AGCG;
AGCCG;
CAGCGAT;
CTTGTTT; and
CTTATTT.

26. The catalytic DNA molecule of claim 24, further comprising one or more
variable or spacer nucleotides between said conserved regions in said conserved core.

27. The catalytic DNA molecule of claim 22, wherein said first substrate
binding region includes a nucleotide sequence selected from the group consisting of:

CATCTCT;
GCTCT;
TTGCTTTTT;
TGTCTTCTC;
TTGCTGCT;
GCCATGCTTT;
CTCTATTTCT;
GTCGGCA;
CATCTCTTC; and
ACTTCT.

28. The catalytic DNA molecule of claim 22, wherein said second substrate
binding region includes a nucleotide sequence selected from the group consisting of:

TATGTGACGCTA;
TATAGTCGTA;
ATAGCGTATTA;
ATAGTTACGTCAT;
AATAGTGAAGTGTT;
TATAGTGTA;
ATAGTCGGT;
ATAGGCCCGGT;
AATAGTGAGGCTTG; and
ATGNTG.

29. The catalytic DNA molecule of claim 22, further comprising a third
substrate binding region, wherein said third region includes a nucleotide sequence
selected from the group consisting of:

TGTT;
TGTTA; and
TGTTAG.

30. The catalytic DNA molecule of claim 29, further comprising one or more
spacer regions between said substrate binding regions.

31. A composition comprising two or more populations of catalytic DNA
molecules according to claim 1, wherein each population of catalytic DNA molecules is
capable of cleaving a different nucleotide sequence in a substrate.

32. A composition comprising two or more populations of catalytic DNA
molecules according to claim 1, wherein each population of catalytic DNA molecules is
capable of recognizing a different substrate.

33. A method of selecting a catalytic DNA molecule that cleaves a substrate
nucleic acid sequence at a specific site, comprising the following steps:

a. obtaining a population of single-stranded DNA molecules;

b. admixing nucleotide-containing substrate molecules with said population
of single-stranded DNA molecules to form an admixture;

c. maintaining said admixture for a sufficient period of time and under
predetermined reaction conditions to allow single-stranded DNA molecules in
said population to cause cleavage of said substrate sequences, thereby
producing substrate cleavage products;

d. separating said population of single-stranded DNA molecules from said
substrate sequences and substrate cleavage products; and

e. isolating single-stranded DNA molecules that cleave nucleotide-containing
substrate at a specific site from said population.

34. The method of claim 33, wherein said substrate comprises RNA.

35. The method of claim 33, wherein said DNA molecules that cleave said
substrate at a specific site are tagged with an immobilizing agent.

36. The method of claim 35, wherein said agent comprises biotin.

37. The method of claim 35, wherein said isolating step further comprises
exposing said tagged DNA molecules to a solid surface having avidin linked thereto,
whereby said tagged DNA molecules become attached to said solid surface.

38. A method of cleaving a phosphoester bond, comprising:

a. admixing a catalytic DNA molecule capable of cleaving a substrate
nucleic acid sequence at a defined cleavage site with a phosphoester bond-
containing substrate, to form a reaction admixture; and

b. maintaining said admixture under predetermined reaction conditions to
allow said catalytic DNA molecule to cleave said phosphoester bond, thereby
producing a population of substrate products.

39. The method of claim 38, further comprising the steps of:

a. separating said products from said catalytic DNA molecule; and

b. adding additional substrate to said catalytic DNA molecule to form a new
reaction admixture.

40. The method of claim 38, wherein said substrate comprises RNA.

41. The method of claim 38, wherein said predetermined reaction conditions
include the presence of a monovalent cation, a divalent cation, or both.

42. A method of engineering catalytic DNA molecules that cleave
phosphoester bonds, comprising the following steps:

a. obtaining a population of single-stranded DNA molecules;

b. introducing genetic variation into said population to produce a variant
population;

c. selecting individuals from said variant population that meet
predetermined selection criteria;

d. separating said selected individuals from the remainder of said variant
population; and

e. amplifying said selected individuals.



43. A non-naturally-occurring catalytic DNA molecule comprising a
nucleotide sequence defining a conserved core flanked by one or more recognition
domains, variable regions, and spacer regions.

44. The catalytic DNA molecule of claim 43, wherein said nucleotide
sequence defines a first variable region contiguous or adjacent to the 5'-terminus of the
molecule, a first recognition domain located 3'-terminal to the first variable region, a first
spacer region located 3'-terminal to the first recognition domain, a first conserved region
located 3'-terminal to the first spacer region, a second spacer region located 3'-terminal
to the first conserved region, a second conserved region located 3'-terminal to the
second spacer region, a second recognition domain located 3'-terminal to the second
conserved region, and a second variable region located 3'-terminal to the second
recognition domain.

45. The catalytic DNA molecule of claim 43, wherein said nucleotide
sequence defines a first variable region contiguous or adjacent to the 5'-terminus of the
molecule, a first recognition domain located 3'-terminal to the first variable region, a first
spacer region located 3'-terminal to the first recognition domain, a first conserved region
located 3'-terminal to the first spacer region, a second spacer region located 3'-terminal
to the first conserved region, a second conserved region located 3'-terminal to the
second spacer region, a second recognition domain located 3'-terminal to the second
conserved region, a second variable region located 3'-terminal to the second recognition
domain, and a third recognition domain located 3'-terminal to the second variable region.

ENZYMATIC DNA MOLECULES

TECHNICAL FIELD 

The present invention relates to nucleic acid enzymes or catalytic (enzymatic)
DNA molecules that are capable of cleaving other nucleic acid molecules, particularly
RNA. The present invention also relates to compositions containing the disclosed
enzymatic DNA molecules and to methods of making and using such enzymes and
compositions.

BACKGROUND 

The need for catalysts that operate outside of their native context or which
catalyze reactions that are not represented in nature has resulted in the development of
"enzyme engineering" technology. The usual route taken in enzyme engineering has
been a "rational design" approach, relying upon the understanding of natural enzymes to
aid in the construction of new enzymes. Unfortunately, the state of proficiency in the
areas of protein structure and chemistry is insufficient to make the generation of novel
biological catalysts routine.

Recently, a different approach for developing novel catalysts has been applied.
This method involves the construction of a heterogeneous pool of macromolecules and
the application of an in vitro selection procedure to isolate molecules from the pool that
catalyze the desired reaction. Selecting catalysts from a pool of macromolecules is not
dependent on a comprehensive understanding of their structural and chemical
properties. Accordingly, this process has been dubbed "irrational design" (Brenner and
Lerner, PNAS USA 89: 5381-5383 (1992)).

Most efforts to date involving the rational design of enzymatic RNA molecules or
ribozymes have not led to molecules with fundamentally new or improved catalytic
function. However, the application of irrational design methods via a process we have
described as "directed molecular evolution" or "in vitro evolution", which is patterned
after Darwinian evolution of organisms in nature, has the potential to lead to the
production of DNA molecules that have desirable functional characteristics.
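
By way of illustration only, the cycle of selection followed by amplification with variation that underlies in vitro evolution can be sketched as a toy simulation in Python. Nothing in this sketch is taken from the disclosed procedures: the pool size, sequence length, mutation rate, and the `is_active` test (here, any caller-supplied predicate on a sequence string) are hypothetical placeholders standing in for a real selection criterion.

```python
import random

ALPHABET = "ACGT"


def random_pool(size, length, rng):
    """Build a heterogeneous pool of random single-stranded DNA sequences."""
    return ["".join(rng.choice(ALPHABET) for _ in range(length)) for _ in range(size)]


def mutate(seq, rate, rng):
    """Redraw each base at random with probability `rate` (toy mutagenesis)."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else base for base in seq)


def evolve(pool, is_active, rounds, rate, rng):
    """Alternate selection and mutagenic amplification for a fixed number of rounds."""
    size = len(pool)
    for _ in range(rounds):
        # Selection: keep only sequences that meet the (hypothetical) criterion.
        survivors = [s for s in pool if is_active(s)]
        if not survivors:
            return []  # nothing left to amplify
        # Amplification with variation: refill the pool from the survivors.
        pool = [mutate(rng.choice(survivors), rate, rng) for _ in range(size)]
    return pool
```

A caller might pass `lambda s: "AGCG" in s` as a stand-in criterion; a real experiment would replace that predicate with a physical separation based on cleavage activity.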

This technique has been applied with varying degrees of success to RNA
molecules in solution (see, e.g., Mills, et al., PNAS USA 58: 217 (1967); Green, et al.,
Nature 347: 406 (1990); Chowrira, et al., Nature 354: 320 (1991); Joyce, Gene 82: 83
(1989); Beaudry and Joyce, Science 257: 635-641 (1992); Robertson and Joyce,
Nature 344: 467 (1990)), as well as to RNAs bound to a ligand that is attached to a
solid support (Tuerk, et al., Science 249: 505 (1990); Ellington, et al., Nature 346: 818
(1990)). It has also been applied to peptides attached directly to a solid support (Lam,
et al., Nature 354: 82 (1991)); and to peptide epitopes expressed within a viral coat
protein (Scott, et al., Science 249: 386 (1990); Devlin, et al., Science 249: 249 (1990);
Cwirla, et al., PNAS USA 87: 6378 (1990)).

It has been more than a decade since the discovery of catalytic RNA (Kruger, et
al., Cell 31: 147-157 (1982); Guerrier-Takada, et al., Cell 35: 849-857 (1983)). The list
of known naturally-occurring ribozymes continues to grow (see Cech, in The RNA World,
Gesteland & Atkins (eds.), pp. 239-269, Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, NY (1993); Pyle, Science 261: 709-714 (1993); Symons, Curr. Opin.
Struct. Biol. 4: 322-330 (1994)) and, in recent years, has been augmented by synthetic
ribozymes obtained through in vitro evolution. (See, e.g., Joyce, Curr. Opin. Struct.
Biol. 4: 331-336 (1994); Breaker & Joyce, Trends Biotech. 12: 268-275 (1994);
Chapman & Szostak, Curr. Opin. Struct. Biol. 4: 618-622 (1994).)

It seems reasonable to assume that DNA can have catalytic activity as well,
considering that it contains most of the same functional groups as RNA. However, with
the exception of certain viral genomes and replication intermediates, nearly all of the
DNA in biological organisms occurs as a complete duplex, precluding it from adopting a
complex secondary and tertiary structure. Thus it is not surprising that DNA enzymes
have not been found in nature.

Until the advent of the present invention, the design, synthesis and use of
catalytic DNA molecules with nucleotide-cleaving capabilities had not been disclosed or
demonstrated. Therefore, the discoveries and inventions disclosed herein are
particularly significant, in that they highlight the potential of in vitro evolution as a
means of designing increasingly more efficient catalytic molecules, including enzymatic
DNA molecules that cleave other nucleic acids, particularly RNA.

BRIEF SUMMARY OF THE INVENTION 

The present invention thus contemplates a synthetic or engineered (i.e.,
non-naturally-occurring) catalytic DNA molecule (or enzymatic DNA molecule) capable of
cleaving a substrate nucleic acid (NA) sequence at a defined cleavage site. The
invention also contemplates an enzymatic DNA molecule having an endonuclease
activity.

In one preferred variation, the endonuclease activity is specific for a nucleotide
sequence defining a cleavage site comprising single-stranded nucleic acid in a substrate
nucleic acid sequence. In another preferred variation, the cleavage site is
double-stranded nucleic acid. Similarly, substrate nucleic acid sequences may be
single-stranded, double-stranded, partially single- or double-stranded, looped, or any
combination thereof.

In another contemplated embodiment, the substrate nucleic acid sequence
includes one or more nucleotide analogues. In one variation, the substrate nucleic acid
sequence is a portion of, or attached to, a larger molecule.

WO 96/17086 PCT/US95/15580

In various embodiments, the larger molecule is selected from the group
consisting of RNA, modified RNA, DNA, modified DNA, nucleotide analogs, or
composites thereof. In another example, the larger molecule comprises a composite of a
nucleic acid sequence and a non-nucleic acid sequence.

In another embodiment, the invention contemplates that a substrate nucleic acid
sequence includes one or more nucleotide analogs. A further variation contemplates
that the single-stranded nucleic acid comprises RNA, DNA, modified RNA, modified
DNA, one or more nucleotide analogs, or any composite thereof. In one embodiment of
the disclosed invention, the endonuclease activity comprises hydrolytic cleavage of a
phosphoester bond at the cleavage site.

In various preferred embodiments, the catalytic DNA molecules of the present
invention are single-stranded in whole or in part. These catalytic DNA molecules may
preferably assume a variety of shapes consistent with their catalytic activity. Thus, in
one variation, a catalytic DNA molecule of the present invention includes one or more
hairpin loop structures. In yet another variation, a catalytic DNA molecule may assume
a shape similar to that of "hammerhead" ribozymes. In still other embodiments, a
catalytic DNA molecule may assume a conformation similar to that of Tetrahymena
thermophila ribozymes, e.g., those derived from group I introns.

Similarly, preferred catalytic DNA molecules of the present invention are able to
demonstrate site-specific endonuclease activity irrespective of the original orientation of
the substrate molecule. Thus, in one preferred embodiment, an enzymatic DNA
molecule of the present invention is able to cleave a substrate nucleic acid sequence
that is separate from the enzymatic DNA molecule (i.e., it is not linked to the
DNAzyme). In another preferred embodiment, an enzymatic DNA molecule is able to
cleave an attached substrate nucleic acid sequence (i.e., it is able to perform a reaction
similar to self-cleavage).

The invention also contemplates enzymatic DNA molecules (catalytic DNA
molecules, deoxyribozymes or DNAzymes) having endonuclease activity, whereby the
endonuclease activity requires the presence of a divalent cation. In various preferred,
alternative embodiments, the divalent cation is selected from the group consisting of
Pb²⁺, Mg²⁺, Mn²⁺, Zn²⁺, and Ca²⁺. Another variation contemplates that the
endonuclease activity requires the presence of a monovalent cation. In such alternative
embodiments, the monovalent cation is preferably selected from the group consisting of
Na⁺ and K⁺.

In various preferred embodiments of the invention, an enzymatic DNA molecule
comprises a nucleotide sequence selected from the group consisting of SEQ ID NO 3;
SEQ ID NO 14; SEQ ID NO 15; SEQ ID NO 16; SEQ ID NO 17; SEQ ID NO 18; SEQ ID
NO 19; SEQ ID NO 20; SEQ ID NO 21; and SEQ ID NO 22. In other preferred
embodiments, a catalytic DNA molecule of the present invention comprises a nucleotide
sequence selected from the group consisting of SEQ ID NO 23; SEQ ID NO 24; SEQ ID
NO 25; SEQ ID NO 26; SEQ ID NO 27; SEQ ID NO 28; SEQ ID NO 29; SEQ ID NO 30;
SEQ ID NO 31; SEQ ID NO 32; SEQ ID NO 33; SEQ ID NO 34; SEQ ID NO 35; SEQ ID
NO 36; SEQ ID NO 37; SEQ ID NO 38; and SEQ ID NO 39.

Another preferred embodiment contemplates that a catalytic DNA molecule of
the present invention comprises a nucleotide sequence selected from the group
consisting of SEQ ID NO 50 and SEQ ID NO 51. In yet another preferred embodiment, a
catalytic DNA molecule of the present invention comprises a nucleotide sequence
selected from the group consisting of SEQ ID NOS 52 through 101. Catalytic DNA
molecules having sequences substantially similar to those disclosed herein are also
contemplated. Thus, a wide variety of substitutions, deletions, insertions, duplications
and other mutations may be made to the within-described molecules in order to
generate a variety of other useful enzymatic DNA molecules; as long as said molecules
display site-specific cleavage activity as disclosed herein, they are within the
boundaries of this disclosure.

In a further variation of the present invention, an enzymatic DNA molecule of the
present invention preferably has a substrate binding affinity of about 1 μM or less. In
another embodiment, an enzymatic DNA molecule of the present invention binds
substrate with a K_D of less than about 0.1 μM.

The present invention also discloses enzymatic DNA molecules having useful
turnover rates. In one embodiment, the turnover rate is less than 5 hr⁻¹; in a preferred
embodiment, the rate is less than about 2 hr⁻¹; in a more preferred embodiment, the rate
is less than about 1 hr⁻¹; in an even more preferred embodiment, the turnover rate is
about 0.6 hr⁻¹ or less.

In still another embodiment, an enzymatic DNA molecule of the present
invention displays a useful turnover rate wherein the k_obs is less than 1 min⁻¹, preferably
less than 0.1 min⁻¹; more preferably, less than 0.01 min⁻¹; and even more preferably,
less than 0.005 min⁻¹. In one variation, the value of k_obs is approximately 0.002 min⁻¹ or
less.
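
To give a sense of scale for the observed rate constants recited above, the following Python sketch converts a value of k_obs into a fraction of substrate cleaved and a half-life. It assumes simple single-exponential (first-order) kinetics, which is an assumption made here for illustration and is not stated in the disclosure; the function names are likewise hypothetical.

```python
import math


def fraction_cleaved(k_obs_per_min, minutes):
    """Fraction of substrate cleaved after `minutes`, assuming first-order kinetics."""
    return 1.0 - math.exp(-k_obs_per_min * minutes)


def half_life_minutes(k_obs_per_min):
    """Time at which half of the substrate is cleaved, under the same assumption."""
    return math.log(2) / k_obs_per_min
```

Under this assumption, a k_obs of 0.002 min⁻¹ corresponds to a half-life of ln 2 / 0.002, or roughly 347 minutes.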

The present invention also contemplates embodiments in which the catalytic rate
of the disclosed DNA enzymes is fully optimized. Thus, in various preferred
embodiments, the K_m for reactions enhanced by the presence of Mg²⁺ is approximately
0.5-20 mM, preferably about 1-10 mM, and more preferably about 2-5 mM.
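
The practical meaning of a K_m for Mg²⁺ can be illustrated with a short Python sketch. It assumes a hyperbolic (saturation) dependence of rate on Mg²⁺ concentration, so that the rate is half-maximal when the concentration equals K_m; that functional form and the function name are assumptions made for illustration, not a model given in the disclosure.

```python
def relative_rate(mg_mM, km_mM):
    """Rate as a fraction of the saturating rate, assuming hyperbolic dependence on [Mg2+]."""
    if mg_mM < 0 or km_mM <= 0:
        raise ValueError("concentration must be non-negative and K_m positive")
    return mg_mM / (km_mM + mg_mM)
```

For example, with a K_m of 5 mM this form gives half of the saturating rate at 5 mM Mg²⁺ and four-fifths of it at 20 mM.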

The present invention also contemplates an embodiment whereby the nucleotide
sequence defining the cleavage site comprises at least one nucleotide. In various other
preferred embodiments, a catalytic DNA molecule of the present invention is able to
recognize and cleave a nucleotide sequence defining a cleavage site of two or more
nucleotides.

In various preferred embodiments, an enzymatic DNA molecule of the present
invention comprises a conserved core flanked by one or more substrate binding regions.
In one embodiment, an enzymatic DNA molecule includes first and second substrate
binding regions. In another embodiment, an enzymatic DNA molecule includes two or
more substrate binding regions.

As noted previously, preferred catalytic DNA molecules of the present invention
may also include a conserved core. In one preferred embodiment, the conserved core
comprises one or more conserved regions. In other preferred variations, the one or more
conserved regions include a nucleotide sequence selected from the group consisting of
CG; CGA; AGCG; AGCCG; CAGCGAT; CTTGTTT; and CTTATTT (see, e.g., Fig. 3).

In one embodiment of the invention, an enzymatic DNA molecule of the present
invention further comprises one or more variable or spacer nucleotides between the
conserved regions in the conserved core. In another embodiment, an enzymatic DNA
molecule of the present invention further comprises one or more variable or spacer
nucleotides between the conserved core and the substrate binding region.

In one variation, the first substrate binding region preferably includes a
nucleotide sequence selected from the group consisting of CATCTCT; GCTCT;
TTGCTTTTT; TGTCTTCTC; TTGCTGCT; GCCATGCTTT (SEQ ID NO 40); CTCTATTTCT
(SEQ ID NO 41); GTCGGCA; CATCTCTTC; and ACTTCT. In another preferred variation,
the second substrate binding region includes a nucleotide sequence selected from the
group consisting of TATGTGACGCTA (SEQ ID NO 42); TATAGTCGTA (SEQ ID NO 43);
ATAGCGTATTA (SEQ ID NO 44); ATAGTTACGTCAT (SEQ ID NO 45);
AATAGTGAAGTGTT (SEQ ID NO 46); TATAGTGTA; ATAGTCGGT; ATAGGCCCGGT
(SEQ ID NO 47); AATAGTGAGGCTTG (SEQ ID NO 48); and ATGNTG.
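
As an illustration of how the recited substrate binding region sequences might be located within a candidate molecule, the following Python sketch scans a sequence for each listed motif, treating N as any of A, C, G, or T. The function name and the plain string-matching approach are assumptions made for illustration; a match says nothing about catalytic activity, and sequences are assumed to be written 5' to 3'.

```python
import re

FIRST_REGIONS = ["CATCTCT", "GCTCT", "TTGCTTTTT", "TGTCTTCTC", "TTGCTGCT",
                 "GCCATGCTTT", "CTCTATTTCT", "GTCGGCA", "CATCTCTTC", "ACTTCT"]
SECOND_REGIONS = ["TATGTGACGCTA", "TATAGTCGTA", "ATAGCGTATTA", "ATAGTTACGTCAT",
                  "AATAGTGAAGTGTT", "TATAGTGTA", "ATAGTCGGT", "ATAGGCCCGGT",
                  "AATAGTGAGGCTTG", "ATGNTG"]


def find_regions(sequence, motifs):
    """Return (motif, start) pairs for each listed motif found; N matches any base."""
    hits = []
    text = sequence.upper()
    for motif in motifs:
        # A lookahead lets overlapping occurrences of the same motif be reported.
        pattern = re.compile("(?=(" + motif.replace("N", "[ACGT]") + "))")
        hits.extend((motif, m.start()) for m in pattern.finditer(text))
    return hits
```

Start positions are zero-based offsets from the 5' end of the scanned string.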

In various embodiments of the present invention, the substrate binding regions
vary in length. Thus, for example, a substrate binding region may comprise a single
nucleotide to dozens of nucleotides. However, it is understood that substrate binding
regions of about 3-25 nucleotides in length, preferably about 3-15 nucleotides in length,
and more preferably about 3-10 nucleotides in length are particularly preferred. In
various embodiments, the individual nucleotides in the substrate binding regions are able
to form complementary base pairs with the nucleotides of the substrate molecules; in
other embodiments, noncomplementary base pairs are formed. A mixture of
complementary and noncomplementary base pairing is also contemplated as falling
within the scope of the disclosed embodiments of the invention.

In another preferred embodiment, a catalytic DNA molecule of the present
invention may further comprise a third substrate binding region. In some preferred
embodiments, the third region includes a nucleotide sequence selected from the group
consisting of TGTT; TGTTA; and TGTTAG. Another preferred embodiment of the
present invention discloses an enzymatic DNA molecule further comprising one or more
variable or "spacer" regions between the substrate binding regions.

In another disclosed embodiment, the present invention contemplates a purified,
synthetic enzymatic DNA molecule separated from other DNA molecules and
oligonucleotides, the enzymatic DNA molecule having an endonuclease activity, wherein
the endonuclease activity is specific for a nucleotide sequence defining a cleavage site
comprising single- or double-stranded nucleic acid in a substrate nucleic acid sequence.
In one variation, a synthetic (or engineered) enzymatic DNA molecule having an
endonuclease activity is disclosed, wherein the endonuclease activity is specific for a
nucleotide sequence defining a cleavage site consisting essentially of a single- or
double-stranded region of a substrate nucleic acid sequence.

In yet another embodiment, the invention contemplates an enzymatic DNA
molecule comprising a deoxyribonucleotide polymer having a catalytic activity for
hydrolyzing a nucleic acid-containing substrate to produce substrate cleavage products.
In one variation, the hydrolysis takes place in a site-specific manner. As noted
previously, the polymer may be single-stranded, double-stranded, or some combination
of both.

The invention further contemplates that the substrate comprises a nucleic acid
sequence. In various embodiments, the nucleic acid sequence substrate comprises
RNA, modified RNA, DNA, modified DNA, one or more nucleotide analogs, or
composites of any of the foregoing. One embodiment contemplates that the substrate
includes a single-stranded segment; still another embodiment contemplates that the
substrate is double-stranded.

The present invention also contemplates an enzymatic DNA molecule comprising
a deoxyribonucleotide polymer having a catalytic activity for hydrolyzing a nucleic
acid-containing substrate to produce a cleavage product. In one variation, the enzymatic
DNA molecule has an effective binding affinity for the substrate and lacks an effective
binding affinity for the cleavage product.

In one preferred embodiment, the invention discloses a non-naturally-occurring
enzymatic DNA molecule comprising a nucleotide sequence defining a conserved core
flanked by recognition domains, variable regions, and spacer regions. Thus, in one
preferred embodiment, the nucleotide sequence defines a first variable region contiguous



WO 96/17086 



PCT/US95/15580 



-7- 

or adjacent to the 5'-terminus of the molecule, a first recognition domain located 3'- 
terminal to the first variable region, a first spacer region located 3'-terminal to the first 
recognition domain, a first conserved region located 3'-terminal to the first spacer 
region, a second spacer region located 3'-terminal to the first conserved region, a 
second conserved region located 3'-terminal to the second spacer region, a second
recognition domain located 3'-terminal to the second conserved region, and a second
variable region located 3'-terminal to the second recognition domain. 

In another embodiment, the nucleotide sequence preferably defines a first
variable region contiguous or adjacent to the 5'-terminus of the molecule, a first
recognition domain located 3'-terminal to the first variable region, a first spacer region
located 3'-terminal to the first recognition domain, a first conserved region located
3'-terminal to the first spacer region, a second spacer region located 3'-terminal to the first
conserved region, a second conserved region located 3'-terminal to the second spacer
region, a second recognition domain located 3'-terminal to the second conserved region,
a second variable region located 3'-terminal to the second recognition domain, and a
third recognition domain located 3'-terminal to the second variable region.
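
The 5'-to-3' ordering of regions recited in the two preceding embodiments can be
sketched as a simple ordered layout. The following Python fragment is only an
illustration under stated assumptions: the region names, the `assemble` helper, and
the placeholder sequences are hypothetical and are not taken from the disclosure.

```python
# Region names follow the 5'-to-3' order recited in the text above.
REGION_ORDER = [
    "variable_1", "recognition_1", "spacer_1", "conserved_1",
    "spacer_2", "conserved_2", "recognition_2", "variable_2",
]
# Optional third recognition domain, located 3' of the second variable region.
OPTIONAL_TAIL = ["recognition_3"]


def assemble(regions: dict) -> str:
    """Concatenate region sequences 5'->3'; raise if a required region is missing."""
    missing = [name for name in REGION_ORDER if name not in regions]
    if missing:
        raise ValueError(f"missing regions: {missing}")
    order = REGION_ORDER + [name for name in OPTIONAL_TAIL if name in regions]
    return "".join(regions[name] for name in order)


# Hypothetical placeholder sequences, chosen only to make the example run.
example = {
    "variable_1": "NN", "recognition_1": "ACACATC", "spacer_1": "T",
    "conserved_1": "AGCG", "spacer_2": "C", "conserved_2": "CG",
    "recognition_2": "TATAGTG", "variable_2": "NN",
}
molecule = assemble(example)
```

Supplying a `recognition_3` entry appends the third recognition domain at the 3' end,
mirroring the second embodiment.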

In one variation of the foregoing, the molecule includes a conserved core region 
flanked by two substrate binding domains; in another, the conserved core region 
comprises one or more conserved domains. In other preferred embodiments, the 
conserved core region further comprises one or more variable or spacer nucleotides. In
yet another embodiment, an enzymatic DNA molecule of the present invention further 
comprises one or more spacer regions. 

The present invention further contemplates a wide variety of compositions. For 
example, compositions including an enzymatic DNA molecule as described hereinabove 
are disclosed and contemplated herein. In one alternative embodiment, a composition 
according to the present invention comprises two or more populations of enzymatic 
DNA molecules as described above, wherein each population of enzymatic DNA 
molecules is capable of cleaving a different sequence in a substrate. In another 
variation, a composition comprises two or more populations of enzymatic DNA 
molecules as described hereinabove, wherein each population of enzymatic DNA 
molecules is capable of recognizing a different substrate. In various embodiments, it is 
also preferred that compositions include a monovalent or divalent cation. 

The present invention further contemplates methods of generating, selecting,
and isolating enzymatic DNA molecules of the present invention. In one variation, a
method of selecting enzymatic DNA molecules that cleave a nucleic acid sequence (e.g.,
RNA) at a specific site comprises the following steps: (a) obtaining a population of
putative enzymatic DNA molecules, whether naturally occurring or synthetic, the
molecules preferably being single-stranded DNA; (b) admixing
nucleotide-containing substrate sequences with the aforementioned population of DNA
molecules to form an admixture; (c) maintaining the admixture for a sufficient period of
time and under predetermined reaction conditions to allow the putative enzymatic DNA
molecules in the population to cause cleavage of the substrate sequences, thereby
producing substrate cleavage products; (d) separating the population of DNA molecules
from the substrate sequences and substrate cleavage products; and (e) isolating DNA
molecules that cleave substrate nucleic acid sequences (e.g., RNA) at a specific site
from the population.

In a further variation of the foregoing method, the DNA molecules that cleave
substrate nucleic acid sequences at a specific site are tagged with an immobilizing
agent. In one example, the agent comprises biotin.

In yet another variation of the aforementioned method, one begins by selecting a
sequence (e.g., a predetermined "target" nucleotide sequence) that one wishes to
cleave using an enzymatic DNA molecule engineered for that purpose. Thus, in one
embodiment, the pre-selected (or predetermined) "target" sequence is used to generate
a population of DNA molecules capable of cleaving substrate nucleic acid sequences at
a specific site via attaching or "tagging" it to a deoxyribonucleic acid sequence
containing one or more randomized sequences or segments. In one variation, the
randomized sequence is about 40 nucleotides in length; in another variation, the
randomized sequence is about 50 nucleotides in length. Randomized sequences that are
1-40, 40-50, and 50-100 nucleotides in length are also contemplated by the present
invention.
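
As a rough illustration of the randomized-segment design described above, the
following Python sketch builds a small pool of DNA sequences, each carrying a
randomized segment between two fixed flanks, and computes the size of the sequence
space for a given segment length. The flank sequences, function names, and pool size
are hypothetical placeholders, not sequences from this disclosure.

```python
import random

BASES = "ACGT"


def sequence_space(n_random: int) -> int:
    """Number of distinct sequences possible over n_random randomized positions."""
    return 4 ** n_random


def make_pool(size, n_random, five_prime="GGAAG", three_prime="CACGG", seed=0):
    """Return `size` sequences: fixed 5' flank + randomized segment + fixed 3' flank.

    The flank sequences are arbitrary placeholders for illustration only.
    """
    rng = random.Random(seed)
    return [
        five_prime + "".join(rng.choice(BASES) for _ in range(n_random)) + three_prime
        for _ in range(size)
    ]


pool = make_pool(size=5, n_random=50)
space_40 = sequence_space(40)  # randomized segment of about 40 nucleotides
space_50 = sequence_space(50)  # randomized segment of about 50 nucleotides
```

Because the sequence space grows as 4 to the power of the segment length, any
physically realizable pool samples only a small fraction of the possible 50-nucleotide
sequences.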

In one embodiment of the present invention, the nucleotide sequence used to
generate a population of enzymatic DNA molecules is selected from the group consisting
of SEQ ID NOS 4, 23, 50, and 51. In another embodiment, the "target" or "substrate"
nucleotide sequence comprises a sequence of one or more ribonucleotides (see, e.g.,
the relevant portions of SEQ ID NOS 4 and 23, and SEQ ID NO 49). It is also
contemplated by the present invention that a useful "target" or "substrate" nucleotide
sequence may comprise DNA, RNA, or a composite thereof.

The invention also contemplates methods as described above, wherein the
isolating step further comprises exposing the tagged DNA molecules to a solid surface
having avidin linked thereto, whereby the tagged DNA molecules become attached to
the solid surface. As before, the substrate may be RNA, DNA, a composite of both, or
a molecule including nucleotide sequences.

The present invention also contemplates a method for specifically cleaving a 
substrate nucleic acid sequence at a particular cleavage site, comprising the steps of (a)
providing an enzymatic DNA molecule capable of cleaving a substrate nucleic acid
sequence at a specific cleavage site; and (b) contacting the enzymatic DNA molecule 
with the substrate nucleic acid sequence to cause specific cleavage of the nucleic acid 
sequence at the cleavage site. In one variation, the enzymatic DNA molecule is a non- 
naturally-occurring (or synthetic) DNA molecule. In another variation, the enzymatic 
DNA molecule is single-stranded. 

In still another variation of the foregoing method, the substrate comprises a
nucleic acid. In various embodiments, the substrate nucleic acid comprises RNA,
modified RNA, DNA, modified DNA, one or more nucleotide analogs, or composites of
any of the foregoing. In yet another embodiment, the specific cleavage is caused by the
endonuclease activity of the enzymatic DNA molecule. Alteration of reaction conditions
(e.g., the adjustment of pH, temperature, percent cation, percent enzyme, percent
substrate, and percent product) is also contemplated herein.

The present invention also contemplates a method of cleaving a phosphoester 
bond, comprising (a) admixing a catalytic DNA molecule capable of cleaving a
substrate nucleic acid sequence at a defined cleavage site with a phosphoester bond- 
containing substrate, to form a reaction admixture; and (b) maintaining the admixture 
under predetermined reaction conditions to allow the enzymatic DNA molecule to cleave 
the phosphoester bond, thereby producing a population of substrate products. In one 
embodiment, the enzymatic DNA molecule is able to cleave the phosphoester bond in a 
site-specific manner. In another embodiment, the method further comprises the steps of 
(c) separating the products from the catalytic DNA molecule; and (d) adding additional 
substrate to the enzymatic DNA molecule to form a new reaction admixture. 

The present invention also contemplates methods of engineering enzymatic DNA 
molecules that cleave phosphoester bonds. One exemplary method comprises the 
following steps: (a) obtaining a population of single-stranded DNA molecules; (b) 
introducing genetic variation into the population to produce a variant population; (c) 
selecting individuals from the variant population that meet predetermined selection 
criteria; (d) separating the selected individuals from the remainder of the variant 
population; and (e) amplifying the selected individuals. 
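
The variation, selection, and amplification cycle outlined in steps (a) through (e)
can be sketched as a toy simulation. The Python fragment below is a minimal sketch
under loud assumptions: `toy_activity` is a hypothetical stand-in for cleavage
activity (it merely counts an arbitrary motif), and the mutation rate, retained
fraction, and function names are illustrative choices, not parameters from this
disclosure.

```python
import random

BASES = "ACGT"


def mutate(seq, rate, rng):
    """Replace each base with a different base with probability `rate`."""
    return "".join(
        rng.choice(BASES.replace(b, "")) if rng.random() < rate else b
        for b in seq
    )


def toy_activity(seq, motif="AGCG"):
    """Hypothetical activity score: occurrences of an arbitrary motif."""
    return seq.count(motif)


def evolve(population, rounds, rate, keep_fraction, rng):
    """Run repeated rounds of variation, selection, separation, and amplification."""
    for _ in range(rounds):
        variants = [mutate(s, rate, rng) for s in population]       # step (b)
        ranked = sorted(variants, key=toy_activity, reverse=True)   # step (c)
        n_keep = max(1, int(len(ranked) * keep_fraction))
        selected = ranked[:n_keep]                                  # step (d)
        # step (e): amplify the selected individuals back to the original pool size
        population = [rng.choice(selected) for _ in range(len(population))]
    return population


rng = random.Random(1)
start = ["".join(rng.choice(BASES) for _ in range(30)) for _ in range(20)]  # step (a)
final = evolve(start, rounds=5, rate=0.02, keep_fraction=0.25, rng=rng)
```

In a laboratory setting the selection criterion would be a physical property, such as
release from a solid support upon cleavage, rather than a computed score.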



Figure 1 illustrates a selective amplification scheme for isolation of DNAs that
cleave a target RNA phosphoester. As shown, double-stranded DNA that contains a
stretch of 50 random nucleotides (the molecule with "N₅₀" indicated above it) is
amplified by PCR, employing a 5'-biotinylated DNA primer that is terminated at the 3'
end by an adenosine ribonucleotide (rA). (The biotin label is indicated via the encircled
letter "B".) This primer is extended by Taq polymerase to yield a DNA product that
contains a single embedded ribonucleotide. The resulting double-stranded DNA is
immobilized on a streptavidin matrix and the unbiotinylated DNA strand is removed by
washing with 0.2 N NaOH. After re-equilibrating the column with a buffered solution,
the column is washed with the same solution with added 1 mM PbOAc. DNAs that
undergo Pb²⁺-dependent self-cleavage are released from the column, collected in the
eluant, and amplified by PCR. The PCR products are then used to initiate the next round
of selective amplification.

Figure 2 illustrates self-cleavage activity of the starting pool of DNA (G0) and
populations obtained after the first through fifth rounds of selection (G1 - G5), in the
presence of lead cation (Pb²⁺). The symbol Pre represents 108-nucleotide precursor
DNA (SEQ ID NO 4); Clv, 28-nucleotide 5'-cleavage product (SEQ ID NO 5); and M,
primer 3a (SEQ ID NO 6), which corresponds in length to the 5'-cleavage product.

Figure 3 illustrates the sequence alignment of individual variants isolated from 
the population after five rounds of selection. The fixed substrate domain is shown at 
the top, with the target riboadenylate identified via an inverted triangle. Substrate 
nucleotides that are commonly involved in presumed base-pairing interactions are 
indicated by vertical bars. Sequences corresponding to the 50 initially-randomized 
nucleotides are aligned antiparallel to the substrate domain. All of the variants are 
3'-terminated by the fixed sequence 5'-CGGTAAGCTTGGCAC-3' (not shown; SEQ ID
NO 1). Nucleotides within the initially-randomized region that are presumed to form 
base pairs with the substrate domain are indicated on the right and left sides of the 
Figure; the putative base-pair-forming regions of the enzymatic DNA molecules are 
individually boxed in each sequence shown. Conserved regions are illustrated via the 
two large, centrally-located boxes. 

Figures 4A and 4B illustrate DNA-catalyzed cleavage of an RNA phosphoester in
an intermolecular reaction that proceeds with catalytic turnover. Figure 4A is a
diagrammatic representation of the complex formed between the 19mer substrate
(3'-TCACTATrAGGAAGAGATGG-5', SEQ ID NO 2) and 38mer DNA enzyme
(5'-ACACATCTCTGAAGTAGCGCCGCCGTATAGTGACGCTA-3', SEQ ID NO 3). The
substrate contains a single adenosine ribonucleotide ("rA", adjacent to the arrow),
flanked by deoxyribonucleotides. The synthetic DNA enzyme is a 38-nucleotide portion
of the most frequently occurring variant shown in Fig. 3. Highly-conserved nucleotides
located within the putative catalytic domain are "boxed". As illustrated, one conserved
sequence is "AGCG", while another is "CG" (reading in the 5'-3' direction).

Figure 4B shows an Eadie-Hofstee plot used to determine $K_m$ (negative slope)
and $V_{max}$ (y-intercept) for DNA-catalyzed cleavage of [5'-³²P]-labeled substrate under
conditions identical to those employed during in vitro selection. Initial rates of cleavage
were determined for reactions involving 5 nM DNA enzyme and either 0.125, 0.5, 1, 2,
or 4 µM substrate.

Figure 5 is a photographic representation showing a polyacrylamide gel
demonstrating specific endoribonuclease activity of four families of selected catalytic
DNAs. Selection of a Pb²⁺-dependent family of molecules was repeated in a
side-by-side fashion as a control (first group). In the second group, Zn²⁺ is used as the cation;
in group three, the cation is Mn²⁺; and in the fourth group, the cation is Mg²⁺. A fifth
site on the gel consists of the cleavage product alone, as a marker.

As noted, there are three lanes within each of the aforementioned four groups.
In each group of three lanes, the first lane shows the lack of activity of the selected
population in the absence of the metal cation, the second lane shows the observed
activity in the presence of the metal cation, and the third lane shows the lack of activity
of the starting pool (G0).

Figures 6A and 6B provide two-dimensional illustrations of a "progenitor"
catalytic DNA molecule and one of several catalytic DNA molecules obtained via the
selective amplification methods disclosed herein, respectively. Figure 6A illustrates an
exemplary molecule from the starting pool, showing the overall configuration of the
molecules represented by SEQ ID NO 23. As illustrated, various complementary
nucleotides flank the random (N) region. Figure 6B is a diagrammatic representation of
one of the Mg²⁺-dependent catalytic DNA molecules (or "DNAzymes") generated via the
within-described procedures. The location of the ribonucleotide in the substrate nucleic
acid is indicated via the arrow in both Figs. 6A and 6B.

Figure 7 illustrates some of the results of ten rounds of in vitro selective
amplification carried out essentially as described in Example 5 hereinbelow. As shown,
two sites and two families of catalysts emerged as displaying the most efficient
cleavage of the target sequence. Cleavage conditions were essentially as indicated in
Fig. 7, namely, 10 mM Mg²⁺, pH 7.5, and 37°C; data collected after the reaction ran for
2 hours are shown. Cleavage (%) is shown plotted against the number of generations
(here, 0 through 10). The number/prevalence of catalytic DNA molecules capable of
cleaving the target sequence at the indicated sites in the substrate is illustrated via the
vertical bars, with cleavage at G↓UAACUAGAGAU shown by the striped bars, and with
cleavage at GUAACUA↓GAGAU illustrated via the open (lightly-shaded) bars.

Figure 8 illustrates the nucleotide sequences, cleavage sites, and turnover rates
of two catalytic DNA molecules of the present invention, clones 8-17 and 10-23.
Reaction conditions were as shown, namely, 10 mM Mg²⁺, pH 7.5, and 37°C. The
DNAzyme identified as clone 8-17 is illustrated on the left, with the site of cleavage of
the RNA substrate indicated by the arrow. The substrate sequence
(5'-GGAAAAAGUAACUAGAGAUGGAAG-3'), which is separate from the DNAzyme (i.e.,
intermolecular cleavage is shown), is labeled as such. Similarly, the DNAzyme
identified herein as 10-23 is shown on the right, with the site of cleavage of the RNA
substrate indicated by the arrow. Again, the substrate sequence is indicated. For the
8-17 enzyme, the turnover rate was approximately 0.6 hr⁻¹; for the 10-23 enzyme, the
turnover rate was approximately 1 hr⁻¹. Noncomplementary pairings are indicated with a
closed circle (•), whereas complementary pairings are indicated with a vertical line (|).

Figure 9 further illustrates the nucleotide sequences, cleavage sites, and
turnover rates of two catalytic DNA molecules of the present invention, clones 8-17 and
10-23. Reaction conditions were as shown, namely, 10 mM Mg²⁺, pH 7.5, and 37°C.
As in Fig. 8, the DNAzyme identified as clone 8-17 is illustrated on the left, with the site
of cleavage of the RNA substrate indicated by the arrow. The substrate sequence
(5'-GGAAAAAGUAACUAGAGAUGGAAG-3'), which is separate from the DNAzyme (i.e.,
intermolecular cleavage is shown), is labeled as such. Similarly, the DNAzyme
identified herein as 10-23 is shown on the right, with the site of cleavage of the RNA
substrate indicated by the arrow. Again, the substrate sequence is indicated. For the
8-17 enzyme, $k_{obs}$ was approximately 0.002 min⁻¹; for the 10-23 enzyme, the value of
$k_{obs}$ was approximately 0.01 min⁻¹. Noncomplementary pairings are indicated with a closed
circle (•), whereas complementary pairings are indicated with a vertical line (|).

DETAILED DESCRIPTION

A. Definitions 

As used herein, the term "deoxyribozyme" is used to describe a DNA-containing
nucleic acid that is capable of functioning as an enzyme. In the present disclosure, the
term "deoxyribozyme" includes endoribonucleases and endodeoxyribonucleases,
although deoxyribozymes with endoribonuclease activity are particularly preferred.
Other terms used interchangeably with deoxyribozyme herein are "enzymatic DNA
molecule", "DNAzyme", or "catalytic DNA molecule", which terms should all be
understood to include enzymatically active portions thereof, whether they are produced
synthetically or derived from organisms or other sources.

The term "enzymatic DNA molecules" also includes DNA molecules that have
complementarity in a substrate-binding region to a specified oligonucleotide target or
substrate; such molecules also have an enzymatic activity which is active to specifically
cleave the oligonucleotide substrate. Stated in another fashion, the enzymatic DNA
molecule is capable of cleaving the oligonucleotide substrate intermolecularly. This
complementarity functions to allow sufficient hybridization of the enzymatic DNA
molecule to the substrate oligonucleotide to allow the intermolecular cleavage of the
substrate to occur. While one-hundred percent (100%) complementarity is preferred,
complementarity in the range of 75-100% is also useful and contemplated by the
present invention.
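
One way to read the 75-100% complementarity range above is as the fraction of
positions in a substrate-binding region that form standard base pairs with the target
when the two strands are aligned antiparallel. The Python sketch below computes that
fraction under this assumption; the function name and example sequences are
hypothetical, and the disclosure does not prescribe this particular calculation.

```python
# Standard A-T, A-U, and G-C pairings, listed in both orientations.
PAIRS = {
    ("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"),
}


def percent_complementarity(arm: str, target: str) -> float:
    """Percent of positions that pair when `arm` and `target` align antiparallel.

    Both sequences are written 5'->3' and must have the same nonzero length.
    """
    if not arm or len(arm) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum((a, t) in PAIRS for a, t in zip(arm, reversed(target)))
    return 100.0 * matches / len(arm)


full = percent_complementarity("ACGT", "ACGU")     # every position pairs
partial = percent_complementarity("ACGA", "ACGU")  # one mismatch in four positions
```

In this example the second arm falls at the lower bound of the stated range, because
three of its four positions pair with the target.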

Enzymatic DNA molecules of the present invention may alternatively be 
described as having nuclease or ribonuclease activity. These terms may be used 
interchangeably herein. 

The term "enzymatic nucleic acid" as used herein encompasses enzymatic RNA 
or DNA molecules, enzymatic RNA-DNA polymers, and enzymatically active portions or 
derivatives thereof, although enzymatic DNA molecules are a particularly preferred class 
of enzymatically active molecules according to the present invention. 

The term "endodeoxyribonuclease", as used herein, is an enzyme capable of 
cleaving a substrate comprised predominantly of DNA. The term "endoribonuclease", as 
used herein, is an enzyme capable of cleaving a substrate comprised predominantly of 
RNA. 

As used herein, the term "base pair" (bp) is generally used to describe a 
partnership of adenine (A) with thymine (T) or uracil (U), or of cytosine (C) with guanine 
(G), although it should be appreciated that less-common analogs of the bases A, T, C, 
and G (as well as U) may occasionally participate in base pairings. Nucleotides that
normally pair up when DNA or RNA adopts a double stranded configuration may also be 
referred to herein as "complementary bases". 

"Complementary nucleotide sequence" generally refers to a sequence of 
nucleotides in a single-stranded molecule or segment of DNA or RNA that is sufficiently 
complementary to that on another single oligonucleotide strand to specifically hybridize 
to it with consequent hydrogen bonding. 

"Nucleotide" generally refers to a monomeric unit of DNA or RNA consisting of a 
sugar moiety (pentose), a phosphate group, and a nitrogenous heterocyclic base. The 
base is linked to the sugar moiety via the glycosidic carbon (1' carbon of the pentose)
and that combination of base and sugar is a "nucleoside". When the nucleoside
contains a phosphate group bonded to the 3' or 5' position of the pentose, it is referred
to as a nucleotide. A sequence of operatively linked nucleotides is typically referred to
herein as a "base sequence" or "nucleotide sequence", and their grammatical
equivalents, and is represented herein by a formula whose left to right orientation is in
the conventional direction of 5'-terminus to 3'-terminus, unless otherwise specified.

"Nucleotide analog" generally refers to a purine or pyrimidine nucleotide that 
differs structurally from A, T, G, C, or U, but is sufficiently similar to substitute for the 
normal nucleotide in a nucleic acid molecule. As used herein, the term "nucleotide
analog" encompasses altered bases, different or unusual sugars (i.e. sugars other than
the "usual" pentose), or a combination of the two. A listing of exemplary analogs
wherein the base has been altered is provided in section C hereinbelow.

"Oligonucleotide or polynucleotide" generally refers to a polymer of single- or 
double-stranded nucleotides. As used herein, "oligonucleotide" and its grammatical 
equivalents will include the full range of nucleic acids. An oligonucleotide will typically 
refer to a nucleic acid molecule comprised of a linear strand of ribonucleotides. The 
exact size will depend on many factors, which in turn depend on the ultimate
conditions of use, as is well known in the art.

As used herein, the term "physiologic conditions" is meant to suggest reaction 
conditions emulating those found in mammalian organisms, particularly humans. While 
variables such as temperature, availability of cations, and pH ranges may vary as 
described in greater detail below, "physiologic conditions" generally comprise a 
temperature of about 35-40°C, with 37°C being particularly preferred, as well as a pH 
of about 7.0-8.0, with 7.5 being particularly preferred, and further comprise the 
availability of cations, preferably divalent and/or monovalent cations, with a 
concentration of about 2-15 mM Mg²⁺ and 0-1.0 M Na⁺ being particularly preferred.
"Physiologic conditions", as used herein, may optionally include the presence of free 
nucleoside cofactor. As noted previously, preferred conditions are described in greater 
detail below. 
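
The ranges given in the definition of "physiologic conditions" can be captured as a
simple range check. The Python sketch below is illustrative only: the `Conditions`
class and `is_physiologic` function are hypothetical names, and the hard bounds are a
simplifying assumption, since the text qualifies each range with "about".

```python
from dataclasses import dataclass


@dataclass
class Conditions:
    temp_c: float  # temperature in degrees Celsius
    ph: float
    mg_mM: float   # Mg2+ concentration in millimolar
    na_M: float    # Na+ concentration in molar


def is_physiologic(c: Conditions) -> bool:
    """True if all values fall inside the ranges quoted in the definition above."""
    return (
        35.0 <= c.temp_c <= 40.0
        and 7.0 <= c.ph <= 8.0
        and 2.0 <= c.mg_mM <= 15.0
        and 0.0 <= c.na_M <= 1.0
    )


preferred = Conditions(temp_c=37.0, ph=7.5, mg_mM=10.0, na_M=0.15)
room_temp = Conditions(temp_c=25.0, ph=7.5, mg_mM=10.0, na_M=0.15)
```

Here `preferred` uses the particularly preferred temperature and pH from the text,
while the Mg²⁺ and Na⁺ values are arbitrary points inside the stated ranges.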

B. Enzymatic DNA Molecules

In various embodiments, an enzymatic DNA molecule of the present invention 
may combine one or more modifications or mutations including additions, deletions, and 
substitutions. In alternative embodiments, such mutations or modifications may be 
generated using methods which produce random or specific mutations or modifications. 
These mutations may, for example, change the length of, or alter the nucleotide 
sequence of, a loop, a spacer region or the recognition sequence (or domain). One or 
more mutations within one catalytically active enzymatic DNA molecule may be 
combined with the mutation(s) within a second catalytically active enzymatic DNA 
molecule to produce a new enzymatic DNA molecule containing the mutations of both 
molecules. 

In other preferred embodiments, an enzymatic DNA molecule of the present 
invention may have random mutations introduced into it using a variety of methods well 
known to those skilled in the art. For example, the methods described by Cadwell and 
Joyce ( PCR Methods and Applications 2 : 28-33 (1992)) are particularly preferred for use 
as disclosed herein, with some modifications, as described in the Examples that follow. 
(Also see Cadwell and Joyce, PCR Methods and Applications 3 (Suppl.): S136-S140
(1994).) According to this modified PCR method, random point mutations may be
introduced into cloned genes.

The aforementioned methods have been used, for example, to mutagenize genes
encoding ribozymes with a mutation rate of 0.66% ± 0.13% (95% confidence interval)
per position, as determined by sequence analysis, with no strong preferences observed
with respect to the type of base substitution. This allows the introduction of random
mutations at any position in the enzymatic DNA molecules of the present invention.
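
To give a sense of what a per-position mutation rate of 0.66% implies, the following
Python sketch computes the expected number of mutations and the probability of
exactly k mutations in a sequence of a given length. It assumes, as a simplification
not stated in the text, that positions mutate independently (a binomial model); the
function names and the example length are hypothetical.

```python
from math import comb

RATE = 0.0066  # 0.66% per position, as quoted above


def expected_mutations(length: int, rate: float = RATE) -> float:
    """Mean number of mutated positions in a sequence of `length` nucleotides."""
    return length * rate


def prob_k_mutations(length: int, k: int, rate: float = RATE) -> float:
    """Binomial probability of exactly k mutated positions (independence assumed)."""
    return comb(length, k) * rate**k * (1.0 - rate) ** (length - k)


mean_100 = expected_mutations(100)        # mean for a 100-nucleotide molecule
p_unchanged_100 = prob_k_mutations(100, 0)  # chance that no position is mutated
```

Under this model a 100-nucleotide molecule carries fewer than one mutation on
average per round, so a substantial share of copies pass through a round unchanged.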

Another method useful in introducing defined or random mutations is disclosed
in Joyce and Inoue, Nucleic Acids Research 17: 711-722 (1989). This latter method
involves excision of a template (coding) strand of a double-stranded DNA, reconstruction
of the template strand with inclusion of mutagenic oligonucleotides, and subsequent
transcription of the partially-mismatched template. This allows the introduction of
defined or random mutations at any position in the molecule by including
polynucleotides containing known or random nucleotide sequences at selected positions.

Enzymatic DNA molecules of the present invention may be of varying lengths
and folding patterns, as appropriate, depending on the type and function of the
molecule. For example, enzymatic DNA molecules may be about 15 to about 400 or
more nucleotides in length, although a length not exceeding about 250 nucleotides is
preferred, to avoid limiting the therapeutic usefulness of molecules by making them too
large or unwieldy. In various preferred embodiments, an enzymatic DNA molecule of the
present invention is at least about 20 nucleotides in length and, while useful molecules
may exceed 100 nucleotides in length, preferred molecules are generally not more than
about 100 nucleotides in length.

In various therapeutic applications, enzymatic DNA molecules of the present
invention comprise the enzymatically active portions of deoxyribozymes. In various
embodiments, enzymatic DNA molecules of the present invention preferably comprise
not more than about 200 nucleotides. In other embodiments, a deoxyribozyme of the
present invention comprises not more than about 100 nucleotides. In still other
preferred embodiments, deoxyribozymes of the present invention are about 20-75
nucleotides in length, more preferably about 20-65 nucleotides in length. Other
preferred enzymatic DNA molecules are about 10-50 nucleotides in length.

In other applications, enzymatic DNA molecules may assume configurations
similar to those of "hammerhead" ribozymes. Such enzymatic DNA molecules are
preferably no more than about 75-100 nucleotides in length, with a length of about
20-50 nucleotides being particularly preferred.

In general, if one intends to synthesize molecules for use as disclosed herein, the
larger the enzymatic nucleic acid molecule is, the more difficult it is to synthesize.
Those of skill in the art will certainly appreciate these design constraints. Nevertheless,
such larger molecules remain within the scope of the present invention.

It is also to be understood that an enzymatic DNA molecule of the present
invention may comprise enzymatically active portions of a deoxyribozyme or may
comprise a deoxyribozyme with one or more mutations, e.g., with one or more
base-pair-forming sequences or spacers absent or modified, as long as such deletions,
additions or modifications do not adversely impact the molecule's ability to perform as
an enzyme.

The recognition domain of an enzymatic DNA molecule of the present invention
typically comprises two nucleotide sequences flanking a catalytic domain, and typically
contains a sequence of at least about 3 to about 30 bases, preferably about 6 to about
15 bases, which are capable of hybridizing to a complementary sequence of bases
within the substrate nucleic acid, giving the enzymatic DNA molecule its high sequence
specificity. Modification or mutation of the recognition site via well-known methods
allows one to alter the sequence specificity of an enzymatic nucleic acid molecule.
(See, e.g., Joyce et al., Nucleic Acids Research 17: 711-712 (1989).)

Enzymatic nucleic acid molecules of the present invention also include those
with altered recognition sites or domains. In various embodiments, these altered
recognition domains confer unique sequence specificities on the enzymatic nucleic acid
molecule including such recognition domains. The exact bases present in the
recognition domain determine the base sequence at which cleavage will take place.
Cleavage of the substrate nucleic acid occurs within the recognition domain. This
cleavage leaves a 2', 3', or 2',3'-cyclic phosphate group on the substrate cleavage
sequence and a 5' hydroxyl on the nucleotide that was originally immediately 3' of the
substrate cleavage sequence in the original substrate. Cleavage can be redirected to a
site of choice by changing the bases present in the recognition sequence (internal guide
sequence). See Murphy et al., Proc. Natl. Acad. Sci. USA 86: 9218-9222 (1989).

Moreover, it may be useful to add a polyamine to facilitate recognition and
binding between the enzymatic DNA molecule and its substrate. Examples of useful
polyamines include spermidine, putrescine or spermine. A spermidine concentration of
about 1 mM may be effective in particular embodiments, while concentrations ranging
from about 0.1 mM to about 10 mM may also be useful.

In various alternative embodiments, an enzymatic DNA molecule of the present
invention has an enhanced or optimized ability to cleave nucleic acid substrates,
preferably RNA substrates. As those of skill in the art will appreciate, the rate of an
enzyme-catalyzed reaction varies depending upon the substrate and enzyme
concentrations and, in general, levels off at high substrate or enzyme concentrations.
Taking such effects into account, the kinetics of an enzyme-catalyzed reaction may be
described in the following terms, which define the reaction.




The enhanced or optimized ability of an enzymatic DNA molecule of the present
invention to cleave an RNA substrate may be determined in a cleavage reaction with
varying amounts of labeled RNA substrate in the presence of enzymatic DNA molecule.
The ability to cleave the substrate is generally defined by the catalytic rate ($k_{cat}$) divided
by the Michaelis constant ($K_M$). The symbol $k_{cat}$ represents the maximal velocity of an
enzyme reaction when the substrate approaches a saturation value. $K_M$ represents the
substrate concentration at which the reaction rate is one-half maximal.

For example, values for K_M and k_cat may be determined in this invention by
experiments in which the substrate concentration [S] is in excess over enzymatic DNA
molecule concentration [E]. Initial rates of reaction (v_0) over a range of substrate
concentrations are estimated from the initial linear phase, generally the first 5% or less
of the reaction. Data points are fit by a least squares method to a theoretical line given
by the equation: v = -K_M (v_0/[S]) + V_max. Thus, k_cat and K_M are determined by the initial
rate of reaction, v_0, and the substrate concentration [S].
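
By way of illustration only, the least squares fit described above may be sketched as follows. The sketch takes v in the equation to be the initial rate v_0, so that a plot of v_0 against v_0/[S] has a slope of -K_M and an intercept of V_max. The function names, the substrate concentrations, and the "true" kinetic constants are hypothetical and are not data from this specification; the relation k_cat = V_max/[E] is the standard textbook relation and is assumed here rather than taken from the text.

```python
def eadie_hofstee_fit(substrate_conc, initial_rates):
    """Ordinary least-squares fit of v0 = -K_M * (v0/[S]) + V_max.

    Returns (K_M, V_max) in the units of the inputs.
    """
    # x-axis is v0/[S]; y-axis is v0
    xs = [v / s for v, s in zip(initial_rates, substrate_conc)]
    ys = list(initial_rates)
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                     # slope of the line is -K_M
    intercept = mean_y - slope * mean_x   # intercept of the line is V_max
    return -slope, intercept


def k_cat_from_vmax(v_max, enzyme_conc):
    """Assumed standard relation: k_cat = V_max / [E]."""
    return v_max / enzyme_conc


# Hypothetical, noise-free data generated from the Michaelis-Menten equation.
TRUE_KM = 2.0      # uM (made-up value)
TRUE_VMAX = 0.5    # uM/min (made-up value)
S = [0.5, 1.0, 2.0, 5.0, 10.0, 20.0]            # uM
V0 = [TRUE_VMAX * s / (TRUE_KM + s) for s in S]  # uM/min

K_M, V_MAX = eadie_hofstee_fit(S, V0)
print(f"K_M = {K_M:.3f} uM, V_max = {V_MAX:.3f} uM/min")
```

With real measurements the points scatter about the line, and because v_0 appears on both axes its error affects both coordinates; a direct nonlinear fit may be preferred where that matters.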

In various alternative embodiments, an enzymatic DNA molecule of the present
invention has an enhanced or optimized ability to cleave nucleic acid substrates,
preferably RNA substrates. In preferred embodiments, the enhanced or optimized ability
of an enzymatic DNA molecule to cleave RNA substrates shows about a 10- to 10⁹-fold
improvement over the uncatalyzed rate. In more preferred embodiments, an enzymatic
DNA molecule of the present invention is able to cleave RNA substrates at a rate that is
about 10³- to 10⁷-fold improved over "progenitor" species. In even more preferred
embodiments, the enhanced or optimized ability to cleave RNA substrates is expressed
as a 10⁴- to 10⁶-fold improvement over the progenitor species. One skilled in the art will
appreciate that the enhanced or optimized ability of an enzymatic DNA molecule to
cleave nucleic acid substrates may vary depending upon the selection constraints
applied during the in vitro evolution procedure of the invention.

Various preferred methods of modifying deoxyribozymes and other enzymatic 
DNA molecules and nucleases of the present invention are further described in Examples 
1-3 hereinbelow. 

C. Nucleotide Analogs

As noted above, the term "nucleotide analog" as used herein generally refers to 
a purine or pyrimidine nucleotide that differs structurally from A, T, G, C, or U, but is 
sufficiently similar to substitute for such "normal" nucleotides in a nucleic acid molecule. 
As used herein, the term "nucleotide analog" encompasses altered bases, different (or
unusual) sugars, altered phosphate backbones, or any combination of these alterations.
Examples of nucleotide analogs useful according to the present invention include those
listed in the following Table, most of which are found in the approved listing of modified 




bases at 37 CFR §1.822 (which is incorporated herein by reference). 

Table 1

Nucleotide Analogs

Abbreviation    Description

ac4c            4-acetylcytidine
chm5u           5-(carboxyhydroxylmethyl)uridine
cm              2'-O-methylcytidine
cmnm5s2u        5-carboxymethylaminomethyl-2-thiouridine
d               dihydrouridine
fm              2'-O-methylpseudouridine
galq            β,D-galactosylqueosine
gm              2'-O-methylguanosine
I               inosine
i6a             N6-isopentenyladenosine
m1a             1-methyladenosine
m1f             1-methylpseudouridine
m1g             1-methylguanosine
m1i             1-methylinosine
m22g            2,2-dimethylguanosine
m2a             2-methyladenosine
m2g             2-methylguanosine
m3c             3-methylcytidine
m5c             5-methylcytidine
m6a             N6-methyladenosine
m7g             7-methylguanosine
mam5u           5-methylaminomethyluridine
mam5s2u         5-methoxyaminomethyl-2-thiouridine
manq            β,D-mannosylmethyluridine
mcm5s2u         5-methoxycarbonylmethyluridine
mo5u            5-methoxyuridine
ms2i6a          2-methylthio-N6-isopentenyladenosine
ms2t6a          N-((9-β-D-ribofuranosyl-2-methylthiopurine-6-yl)carbamoyl)threonine
mt6a            N-((9-β-D-ribofuranosylpurine-6-yl)N-methyl-carbamoyl)threonine
mv              uridine-5-oxyacetic acid methylester
o5u             uridine-5-oxyacetic acid (v)
osyw            wybutoxosine
p               pseudouridine
q               queosine
s2c             2-thiocytidine
s2t             5-methyl-2-thiouridine
s2u             2-thiouridine
s4u             4-thiouridine
t               5-methyluridine
t6a             N-((9-β-D-ribofuranosylpurine-6-yl)carbamoyl)threonine
tm              2'-O-methyl-5-methyluridine
um              2'-O-methyluridine
yw              wybutosine
x               3-(3-amino-3-carboxypropyl)uridine, (acp3)u
araU            β,D-arabinosyl
araT            β,D-arabinosyl



Other useful analogs include those described in published international
application no. WO 92/20823 (the disclosures of which are incorporated herein by
reference), or analogs made according to the methods disclosed therein. Analogs
described in DeMesmaeker, et al., Angew. Chem. Int. Ed. Engl. 33: 226-229 (1994);
DeMesmaeker, et al., Synlett: 733-736 (Oct. 1993); Nielsen, et al., Science 254: 1497-1500
(1991); and Idziak, et al., Tetrahedron Letters 34: 5417-5420 (1993) are also
useful according to the within-disclosed invention and said disclosures are incorporated
by reference herein.

D. Methods of Engineering Enzymatic DNA Molecules

The present invention also contemplates methods of producing nucleic acid 
molecules having a predetermined activity. In one preferred embodiment, the nucleic 
acid molecule is an enzymatic DNA molecule. In another variation, the desired activity is 
a catalytic activity. 

In one embodiment, the present invention contemplates methods of synthesizing
enzymatic DNA molecules that may then be "engineered" to catalyze a specific or
predetermined reaction. Methods of preparing enzymatic DNA molecules are described
herein; see, e.g., Examples 1-3 hereinbelow. In other embodiments, an enzymatic DNA
molecule of the present invention may be engineered to bind small molecules or ligands,
such as adenosine triphosphate (ATP). (See, e.g., Sassanfar, et al., Nature 364: 550-553
(1993).)

In another embodiment, the present invention contemplates that a population of 
enzymatic DNA molecules may be subjected to mutagenizing conditions to produce a 
diverse population of mutant enzymatic DNA molecules (which may alternatively be
called "deoxyribozymes" or "DNAzymes"). Thereafter, enzymatic DNA molecules having
desired characteristics are selected and/or separated from the population and are
subsequently amplified.
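
The cycle of mutagenesis, selection and amplification described above may be pictured with the following toy simulation, which is offered only as a conceptual sketch. The scoring function `toy_activity`, the starting sequence, the target motif, and all numerical settings are hypothetical stand-ins; in practice the selection step is a physical separation based on catalytic activity, not a computed score.

```python
import random

BASES = "ACGT"


def mutagenize(seq, rate, rng):
    """Return a copy of seq in which each position is substituted with probability `rate`."""
    out = []
    for base in seq:
        if rng.random() < rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def toy_activity(seq, motif):
    """Hypothetical stand-in for an activity assay: count positions matching a motif."""
    return sum(1 for a, b in zip(seq, motif) if a == b)


def evolve(start, motif, rounds=5, pool_size=200, keep=20, rate=0.05, seed=0):
    """Run repeated rounds of mutagenize -> select -> amplify on a pool of sequences."""
    rng = random.Random(seed)
    pool = [start] * pool_size
    for _ in range(rounds):
        # 1. Mutagenize: diversify the population.
        pool = [mutagenize(s, rate, rng) for s in pool]
        # 2. Select: keep the molecules with the highest (toy) activity.
        pool.sort(key=lambda s: toy_activity(s, motif), reverse=True)
        survivors = pool[:keep]
        # 3. Amplify: copy the survivors back up to the full pool size.
        pool = [survivors[i % keep] for i in range(pool_size)]
    return pool


START = "ACGTACGTACGT"   # arbitrary starting sequence (not from this specification)
MOTIF = "TTTTGGGGCCCC"   # arbitrary "ideal" sequence used only for scoring
FINAL_POOL = evolve(START, MOTIF)
best = max(FINAL_POOL, key=lambda s: toy_activity(s, MOTIF))
print(best, toy_activity(best, MOTIF))
```

Raising `rate` broadens the diversity explored in each round, while lowering `keep` makes the selection more stringent; both are arbitrary knobs of this sketch.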

Alternatively, mutations may be introduced in the enzymatic DNA molecule by 
altering the length of the recognition domains of the enzymatic DNA molecule. The
recognition domains of the enzymatic DNA molecule associate with a complementary
sequence of bases within a substrate nucleic acid sequence. Methods of altering the
length of the recognition domains are known in the art and include PCR, for example;
useful techniques are described further in the Examples below.

Alteration of the length of the recognition domains of an enzymatic DNA
molecule may have a desirable effect on the binding specificity of the enzymatic DNA
molecule. For example, an increase in the length of the recognition domains may
increase binding specificity between the enzymatic DNA molecule and the
complementary base sequences of an oligonucleotide in a substrate, or may enhance
recognition of a particular sequence in a hybrid substrate. In addition, an increase in the
length of the recognition domains may also increase the affinity with which it binds to
substrate. In various embodiments, these altered recognition domains in the enzymatic
DNA molecule confer increased binding specificity and affinity between the enzymatic
DNA molecule and its substrate.

It has recently been noted that certain oligonucleotides are able to recognize and
bind molecules other than oligonucleotides with complementary sequences. These
oligonucleotides are often given the name "aptamers". For example, Ellington and
Szostak describe RNA molecules that are able to bind a variety of organic dyes (Nature
346: 818-822 (1990)), while Bock, et al. describe ssDNA molecules that bind human
thrombin (Nature 355: 564-566 (1992)). Similarly, Jellinek, et al. describe RNA ligands
to basic fibroblast growth factor (PNAS USA 90: 11227-11231 (1993)). Thus, it is
further contemplated herein that the catalytically active DNA enzymes of the present
invention may be engineered according to the within-described methods to display a
variety of capabilities typically associated with aptamers.

One of skill in the art should thus appreciate that the enzymatic DNA molecules
of this invention can be altered at any nucleotide sequence, such as the recognition
domains, by various methods disclosed herein, including PCR and 3SR (self-sustained
sequence replication; see Example 1 below). For example, additional nucleotides can
be added to the 5' end of the enzymatic DNA molecule by including additional
nucleotides in the primers.

Enzymatic DNA molecules of the present invention may also be prepared or 
engineered in a more non-random fashion via use of methods such as site-directed 
mutagenesis. For example, site-directed mutagenesis may be carried out essentially as 
described in Morinaga, et al., Biotechnology 2: 636 (1984), modified as described
herein, for application to deoxyribozymes. Useful methods of engineering enzymatic 
DNA molecules are further described in the Examples below. 

In one disclosed embodiment, an enzymatic DNA molecule of the present
invention comprises a conserved core flanked by two substrate binding (or recognition)
domains or sequences that interact with the substrate through base-pairing interactions.
In various embodiments, the conserved core comprises one or more conserved domains
or sequences. In another variation, an enzymatic DNA molecule further comprises a
"spacer" region (or sequence) between the regions (or sequences) involved in base
pairing. In still another variation, the conserved core is "interrupted" at various intervals
by one or more less-conserved variable or "spacer" nucleotides.

In various embodiments, the population of enzymatic DNA molecules is made up 
of at least 2 different types of deoxyribozyme molecules. For example, in one variation, 
the molecules have differing sequences. In another variation, the deoxyribozymes are 
nucleic acid molecules having a nucleic acid sequence defining a recognition domain that 
is contiguous or adjacent to the 5'-terminus of the nucleotide sequence. In various 
alternative embodiments, enzymatic DNA molecules of the present invention may further 
comprise one or more spacer regions located 3'-terminal to the recognition domains, one
or more loops located 3'-terminal to the recognition domains and/or spacer regions. In
other variations, a deoxyribozyme of the present invention may comprise one or more
regions which are capable of hybridizing to other regions of the same molecule. Other
characteristics of enzymatic DNA molecules produced according to the presently-disclosed
methods are described elsewhere herein.

In other embodiments, mutagenizing conditions include conditions that introduce 
either defined or random nucleotide substitutions within an enzymatic DNA molecule. 
Examples of typical mutagenizing conditions include conditions disclosed in other parts 
of this specification and the methods described by Joyce et al., Nucl. Acids Res. 17:
711-722 (1989); Joyce, Gene 82: 83-87 (1989); and Beaudry and Joyce, Science 257:
635-41 (1992).

In still other embodiments, a diverse population of mutant enzymatic nucleic acid 
molecules of the present invention is one that contains at least 2 nucleic acid molecules 
that do not have the exact same nucleotide sequence. In other variations, from such a 
diverse population, an enzymatic DNA molecule or other enzymatic nucleic acid having a 
predetermined activity is then selected on the basis of its ability to perform the 
predetermined activity. In various embodiments, the predetermined activity comprises, 
without limitation, enhanced catalytic activity, decreased K_M, enhanced substrate
binding ability, altered substrate specificity, and the like. 

Other parameters which may be considered aspects of enzyme performance 
include catalytic activity or capacity, substrate binding ability, enzyme turnover rate, 
enzyme sensitivity to feedback mechanisms, and the like. In certain aspects, substrate 
specificity may be considered an aspect of enzyme performance, particularly in 
situations in which an enzyme is able to recognize and bind two or more competing 
substrates, each of which affects the enzyme's performance with respect to the other 
substrate(s). 

Substrate specificity, as used herein, may refer to the specificity of an enzymatic 
nucleic acid molecule as described herein for a particular substrate, such as one 
comprising ribonucleotides only, deoxyribonucleotides only, or a composite of both. 
Substrate molecules may also contain nucleotide analogs. In various embodiments, an 
enzymatic nucleic acid molecule of the present invention may preferentially bind to a 
particular region of a hybrid or non-hybrid substrate. 

The term or parameter identified herein as "substrate specificity" may also 
include sequence specificity; i.e., an enzymatic nucleic acid molecule of the present 
invention may "recognize" and bind to a nucleic acid substrate having a particular 
nucleic acid sequence. For example, if the substrate recognition domains of an 
enzymatic nucleic acid molecule of the present invention will only bind to substrate 
molecules having a series of one or two ribonucleotides (e.g., rA) in a row, then the 
enzymatic nucleic acid molecule will tend not to recognize or bind nucleic acid substrate 
molecules lacking such a sequence. 

With regard to the selection process, in various embodiments, selecting includes 
any means of physically separating the mutant enzymatic nucleic acids having a 
predetermined activity from the diverse population of mutant enzymatic nucleic acids. 
Often, selecting comprises separation by size, by the presence of a catalytic activity, or 
by hybridizing the mutant nucleic acid to another nucleic acid, to a peptide, or some 
other molecule that is either in solution or attached to a solid matrix.

In various embodiments, the predetermined activity is such that the mutant
enzymatic nucleic acid having the predetermined activity becomes labeled in some 
fashion by virtue of the activity. For example, the predetermined activity may be an 
enzymatic DNA molecule activity whereby the activity of the mutant enzymatic nucleic 
acid upon its substrate causes the mutant enzymatic nucleic acid to become covalently 
linked to it. The mutant enzymatic nucleic acid is then selected by virtue of the 
covalent linkage. 

In other embodiments, selecting a mutant enzymatic nucleic acid having a 
predetermined activity includes amplification of the mutant enzymatic nucleic acid (see, 
e.g., Joyce, Gene 82: 83-87 (1989); Beaudry and Joyce, Science 257: 635-41 (1992)).
Other methods of selecting an enzymatic nucleic acid molecule having a predetermined 
characteristic or activity are described in the Examples section. 

E. Compositions 

The invention also contemplates compositions containing one or more types or
populations of enzymatic DNA molecules of the present invention; e.g., different types 
or populations may recognize and cleave different nucleotide sequences. Compositions 
may further include a ribonucleic acid-containing substrate. Compositions according to 
the present invention may further comprise lead ion, magnesium ion, or other divalent or 
monovalent cations, as discussed herein. 

Preferably, the enzymatic DNA molecule is present at a concentration of about
0.05 µM to about 2 µM. Typically, the enzymatic DNA molecule is present at a
concentration ratio of enzymatic DNA molecule to substrate of from about 1:5 to about
1:50. More preferably, the enzymatic DNA molecule is present in the composition at a
concentration of about 0.1 µM to about 1 µM. Even more preferably, compositions
contain the enzymatic DNA molecule at a concentration of about 0.1 µM to about 0.5
µM. Preferably, the substrate is present in the composition at a concentration of about
0.5 µM to about 1000 µM.
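
As a worked illustration of the concentration ranges recited above, the following sketch checks a proposed enzyme and substrate concentration against the broadest stated ranges and the 1:5 to 1:50 ratio. The function name, the dictionary keys, and the example concentrations are hypothetical, and because the stated limits are approximate ("about"), the hard cut-offs used here are a simplification.

```python
# Broadest ranges recited in the text, treated here as hard limits for simplicity.
ENZYME_RANGE_UM = (0.05, 2.0)        # enzymatic DNA molecule, in micromolar
SUBSTRATE_RANGE_UM = (0.5, 1000.0)   # substrate, in micromolar
SUBSTRATE_PER_ENZYME = (5.0, 50.0)   # enzyme:substrate ratio of 1:5 to 1:50


def check_composition(enzyme_um, substrate_um):
    """Report whether a proposed composition falls inside the recited ranges."""
    if enzyme_um <= 0:
        raise ValueError("enzyme concentration must be positive")
    excess = substrate_um / enzyme_um  # moles of substrate per mole of enzyme
    return {
        "enzyme_in_range": ENZYME_RANGE_UM[0] <= enzyme_um <= ENZYME_RANGE_UM[1],
        "substrate_in_range": SUBSTRATE_RANGE_UM[0] <= substrate_um <= SUBSTRATE_RANGE_UM[1],
        "ratio_in_range": SUBSTRATE_PER_ENZYME[0] <= excess <= SUBSTRATE_PER_ENZYME[1],
        "substrate_per_enzyme": excess,
    }


# Hypothetical example: 0.1 uM enzyme with 2 uM substrate (a 1:20 ratio).
print(check_composition(0.1, 2.0))
```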

One skilled in the art will understand that there are many sources of nucleic 
acid-containing substrates including naturally-occurring and synthetic sources. Sources 
of suitable substrates include, without limitation, a variety of viral and retroviral agents, 
including HIV-1, HIV-2, HTLV-I, and HTLV-II. 

Other suitable substrates include, without limitation, viral and retroviral agents 
including those comprising or produced by picornaviruses, hepadnaviridae (e.g., HBV, 
HCV), papillomaviruses (e.g., HPV), gammaherpesvirinae (e.g., EBV), 
lymphocryptoviruses, leukemia viruses (e.g., HTLV-I and -II), flaviviruses, togaviruses, 
herpesviruses (including alphaherpesviruses and betaherpesviruses), cytomegaloviruses 
(CMV), influenza viruses, and viruses and retroviruses contributing to immunodeficiency
diseases and syndromes (e.g., HIV-1 and -2). In addition, suitable substrates include
viral and retroviral agents which infect non-human primates and other animals including, 
without limitation, the simian and feline immunodeficiency viruses and bovine leukemia 
viruses. 

Magnesium ion, lead ion, or another suitable monovalent or divalent cation, as 
described previously, may also be present in the composition, at a concentration ranging 
from about 1-100 mM. More preferably, the preselected ion is present in the 
composition at a concentration of about 2 mM to about 50 mM, with a concentration of 
about 5 mM being particularly preferred. One skilled in the art will understand that the 
ion concentration is only constrained by the limits of solubility of its source (e.g. 
magnesium) in aqueous solution and a desire to have the enzymatic DNA molecule 
present in the same composition in an active conformation. 

The invention also contemplates compositions containing an enzymatic DNA 
molecule of the present invention, hybrid deoxyribonucleotide-ribonucleotide molecules, 
and magnesium or lead ion in concentrations as described hereinabove. As noted 
previously, other monovalent or divalent ions (e.g., Ca²⁺) may be used in place of
magnesium. 

Also contemplated by the present invention are compositions containing an 
enzymatic DNA molecule of the present invention, nucleic acid-containing substrate (e.g. 
RNA), and a preselected ion at a concentration of greater than about 1 millimolar, 
wherein said substrate is greater in length than the recognition domains present on the 
enzymatic DNA molecule. 

In one variation, a composition comprises an enzymatic DNA molecule-substrate 
complex, wherein base pairing between an enzymatic DNA molecule and its substrate is 
contiguous. In another embodiment, base pairing between an enzymatic DNA molecule 
and its substrate is interrupted by one or more noncomplementary pairs. In a variety of 
alternative embodiments, a composition of the present invention may further comprise a 
monovalent cation, a divalent cation, or both. 

In another variation, an enzymatic DNA molecule of the present invention is 
capable of functioning efficiently in the presence or absence of a divalent cation. In one 
variation, a divalent cation is present and comprises Pb²⁺, Mg²⁺, Mn²⁺, Zn²⁺, or Ca²⁺.
Alternatively, an enzymatic DNA molecule of the present invention is capable of
functioning efficiently in the presence or absence of monovalent cations. It is
anticipated that monovalent or divalent cation concentrations similar to those described
herein for Pb²⁺ or Mg²⁺ will be useful as disclosed herein.

Optionally, monovalent cations may also be present in addition to, or as
"alternatives" for, divalent cations. For example, monovalent cations such as sodium
(Na⁺) or potassium (K⁺) may be present, either as dissociated ions or in the form of
dissociable compounds such as NaCl or KCl.

In one embodiment, the concentration of monovalent cation present in the 
composition ranges from 0 - 1.0 M. In another embodiment, a monovalent cation is 
present in a concentration ranging from about 0-200 mM. In other embodiments, 
monovalent cations are present in a concentration ranging from about 1-100 mM. 
Alternatively, the concentration of monovalent cations ranges from about 2 mM - 50 
mM. In still other embodiments, the concentration ranges from about 2 mM - 25 mM. 

F. Methods of Using Enzymatic DNA Molecules

The methods of using enzymatic DNA molecules as disclosed herein are legion. 
As discussed previously, molecules capable of cleaving the bonds linking neighboring 
nucleic acids (e.g., phosphoester bonds) have numerous uses encompassing a wide
variety of applications. For example, enzymatic DNA molecules having the within-disclosed
capabilities, structures, and/or functions are useful in pharmaceutical and
medical products (e.g., for wound debridement, clot dissolution, etc.), as well as in
household items (e.g., detergents, dental hygiene products, meat tenderizers). Industrial 
utility of the within-disclosed compounds, compositions and methods is also 
contemplated and well within the scope of the present invention. 

The present invention also describes useful methods for cleaving any single- 
stranded, looped, partially or fully double-stranded nucleic acid; the majority of these 
methods employ the novel enzymatically active nucleic acid molecules of the present 
invention. In various embodiments, the single-stranded nucleic acid segment or portion 
of the substrate (or the entire substrate itself) comprises DNA, modified DNA, RNA, 
modified RNA, or composites thereof. Preferably, the nucleic acid substrate need only 
be single-stranded at or near the substrate cleavage sequence so that an enzymatic 
nucleic acid molecule of the present invention can hybridize to the substrate cleavage 
sequence by virtue of the enzyme's recognition sequence. 

A nucleic acid substrate that can be cleaved by a method of this invention may 
be chemically synthesized or enzymatically produced, or it may be isolated from various 
sources such as phages, viruses, prokaryotic cells, or eukaryotic cells, including animal 
cells, plant cells, yeast cells and bacterial cells. Chemically synthesized single- and 
double-stranded nucleic acids are commercially available from many sources including, 
without limitation, Research Genetics (Huntsville, AL). 

RNA substrates may also be synthesized using an Applied Biosystems (Foster
City, CA) oligonucleotide synthesizer according to the manufacturer's instructions.
Single-stranded phage are also a source of nucleic acid substrates. (See, e.g., Messing
et al., PNAS USA 74: 3642-3646 (1977), and Yanisch-Perron et al., Gene 33: 103-119
(1985).) Bacterial cells containing single-stranded phage would also be a ready source
of suitable single-stranded nucleic acid substrates.

Single-stranded RNA cleavable by a method of the present invention could be 
provided by any of the RNA viruses such as the picornaviruses, togaviruses,
orthomyxoviruses, paramyxoviruses, rhabdoviruses, coronaviruses, arenaviruses or
retroviruses. As noted previously, a wide variety of prokaryotic and eukaryotic cells
may also be excellent sources of suitable nucleic acid substrates.

The methods of this invention may be used on single-stranded nucleic acids or
single-stranded portions of looped or double-stranded nucleic acids that are present
inside a cell, including eukaryotic, prokaryotic, plant, animal, yeast or bacterial cells.
Under these conditions an enzymatic nucleic acid molecule (e.g., an enzymatic DNA
molecule or deoxyribozyme) of the present invention could act as an anti-viral agent or a
regulator of gene expression. Examples of such uses of enzymatic DNA molecules of
the present invention are described further hereinbelow.

In the majority of methods of the present invention, cleavage of single-stranded
nucleic acids occurs at the 3'-terminus of a predetermined base sequence. This
predetermined base sequence or substrate cleavage sequence typically contains from 1
to about 10 nucleotides. In other preferred embodiments, an enzymatic DNA molecule
of the present invention is able to recognize nucleotides either upstream, or upstream
and downstream of the cleavage site. In various embodiments, an enzymatic DNA
molecule is able to recognize about 2-10 nucleotides upstream of the cleavage site; in
other embodiments, an enzymatic DNA molecule is able to recognize about 2-10
nucleotides upstream and about 2-10 nucleotides downstream of the cleavage site.
Other preferred embodiments contemplate an enzymatic DNA molecule that is capable
of recognizing a nucleotide sequence up to about 30 nucleotides in length, with a length
up to about 20 nucleotides being even more preferred.

The within-disclosed methods allow cleavage at any nucleotide sequence by
altering the nucleotide sequence of the recognition domains of the enzymatic DNA
molecule. This allows cleavage of single-stranded nucleic acid in the absence of a
restriction endonuclease site at the selected position.
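
To illustrate how recognition domains might be retargeted to a chosen site, the following sketch derives two DNA recognition sequences complementary to the RNA nucleotides upstream and downstream of a selected cleavage position. It assumes simple antiparallel Watson-Crick pairing, so that the 5' domain of the enzyme pairs with the downstream substrate nucleotides and the 3' domain pairs with the upstream nucleotides. The function names, the default domain lengths, and the example substrate are hypothetical, and the catalytic core that would lie between the two domains is not modeled.

```python
# DNA base that pairs with each RNA base (Watson-Crick pairing only).
DNA_COMPLEMENT_OF_RNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_complement_dna(rna):
    """Return the DNA strand, written 5'->3', that is complementary to an RNA written 5'->3'."""
    return "".join(DNA_COMPLEMENT_OF_RNA[b] for b in reversed(rna.upper()))


def recognition_domains(substrate_rna, cleavage_index, upstream_len=8, downstream_len=8):
    """Derive (5' domain, 3' domain) for cleavage between positions
    cleavage_index - 1 and cleavage_index of the substrate (0-based, 5'->3').
    """
    if not 0 < cleavage_index < len(substrate_rna):
        raise ValueError("cleavage_index must fall inside the substrate")
    upstream = substrate_rna[max(0, cleavage_index - upstream_len):cleavage_index]
    downstream = substrate_rna[cleavage_index:cleavage_index + downstream_len]
    five_prime_domain = reverse_complement_dna(downstream)  # pairs downstream of the site
    three_prime_domain = reverse_complement_dna(upstream)   # pairs upstream of the site
    return five_prime_domain, three_prime_domain


# Hypothetical substrate and cleavage position.
print(recognition_domains("ACGUACGGAUCC", 6, upstream_len=3, downstream_len=4))
```

Changing `cleavage_index` or the two length arguments yields recognition domains for a different site or of a different length, which mirrors the retargeting described in the text.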

An enzymatic DNA molecule of the present invention may be separated from any
portion of the single-stranded nucleic acid substrate that remains attached to the
enzymatic DNA molecule by site-specific hydrolysis at the appropriate cleavage site.
Separation of the enzymatic DNA molecule from the substrate (or "cleavage product")
allows the enzymatic DNA molecule to carry out another cleavage reaction.

Generally, the nucleic acid substrate is treated under appropriate nucleic acid
cleaving conditions (preferably, physiologic conditions) with an effective amount of
an enzymatic DNA molecule of the present invention. If the nucleic acid substrate
comprises DNA, cleaving conditions may include the presence of a divalent cation at a
concentration of about 2-10 mM.

An effective amount of an enzymatic DNA molecule is the amount required to 
cleave a predetermined base sequence present within the single-stranded nucleic acid. 
Preferably, the enzymatic DNA molecule is present at a molar ratio of DNA molecule to 
substrate cleavage sites of 1 to 20. This ratio may vary depending on the length of 
treating and efficiency of the particular enzymatic DNA molecule under the particular 
nucleic acid cleavage conditions employed.

Thus, in one preferred embodiment, treating typically involves admixing, in 
aqueous solution, the RNA-containing substrate and the enzyme to form a cleavage 
admixture, and then maintaining the admixture thus formed under RNA cleaving 
conditions for a time period sufficient for the enzymatic DNA molecule to cleave the 
RNA substrate at any of the predetermined nucleotide sequences present in the RNA. In 
various embodiments, a source of ions is also provided -- i.e. monovalent or divalent 
cations, or both. 

In one embodiment of the present invention, the amount of time necessary for 
the enzymatic DNA molecule to cleave the single-stranded nucleic acid has been 
predetermined. The amount of time is from about 1 minute to about 24 hours and will 
vary depending upon the concentration of the reactants and the temperature of the 
reaction. Usually, this time period is from about 10 minutes to about 2 hours such that
the enzymatic DNA molecule cleaves the single-stranded nucleic acid at any of the 
predetermined nucleotide sequences present. 

The invention further contemplates that the nucleic acid cleaving conditions 
include the presence of a source of divalent cations (e.g., PbOAc) at a concentration of 
about 2-100 mM. Typically, the nucleic acid cleaving conditions include divalent cation 
at a concentration of about 2 mM to about 10 mM, with a concentration of about 5 mM 
being particularly preferred. 

The optimal cationic concentration to include in the nucleic acid cleaving 
conditions can be easily determined by determining the amount of single-stranded 
nucleic acid cleaved at a given cation concentration. One skilled in the art will 
understand that the optimal concentration may vary depending on the particular 
enzymatic DNA molecule employed. 

The present invention further contemplates that the nucleic acid cleaving 
conditions include a pH of about pH 6.0 to about pH 9.0. In one preferred embodiment, 
the pH ranges from about pH 6.5 to pH 8.0. In another preferred embodiment, the pH 
emulates physiological conditions, i.e., the pH is about 7.0-7.8, with a pH of about 7.5
being particularly preferred.

One skilled in the art will appreciate that the methods of the present invention 
will work over a wide pH range so long as the pH used for nucleic acid cleaving is such 
that the enzymatic DNA molecule is able to remain in an active conformation. An
enzymatic DNA molecule in an active conformation is easily detected by its ability to
cleave single-stranded nucleic acid at a predetermined nucleotide sequence.

In various embodiments, the nucleic acid cleaving conditions also include a
variety of temperature ranges. As noted previously, temperature ranges consistent with
physiological conditions are especially preferred, although temperature ranges consistent
with industrial applications are also contemplated herein. In one embodiment, the
temperature ranges from about 15°C to about 60°C. In another variation, the nucleic
acid cleaving conditions include a temperature ranging from about 30°C to about 56°C.
In yet another variation, nucleic acid cleavage conditions include a temperature from
about 35°C to about 50°C. In a preferred embodiment, nucleic acid cleavage conditions
comprise a temperature range of about 37°C to about 42°C. The temperature ranges
consistent with nucleic acid cleaving conditions are constrained only by the desired
cleavage rate and the stability of that particular enzymatic DNA molecule at that
particular temperature.

In various methods, the present invention contemplates nucleic acid cleaving
conditions including the presence of a polyamine. Polyamines useful for practicing the
present invention include spermidine, putrescine, spermine and the like. In one
variation, the polyamine is present at a concentration of about 0.01 mM to about 10 mM.
In another variation, the polyamine is present at a concentration of about 1 mM to about
10 mM. Nucleic acid cleavage conditions may also include the presence of polyamine at
a concentration of about 2 mM to about 5 mM. In various preferred embodiments, the
polyamine is spermidine.
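
The cleaving conditions recited in this section (pH, temperature, divalent cation and polyamine concentration) may be collected into a simple check such as the following sketch, which flags any parameter lying outside the broadest range stated for it. The class name, field names, and example values are hypothetical, and the recited limits are approximate ("about"), so the hard cut-offs used here are a simplification.

```python
from dataclasses import dataclass

# Broadest ranges recited in the text for each parameter.
BROAD_RANGES = {
    "ph": (6.0, 9.0),
    "temperature_c": (15.0, 60.0),
    "divalent_cation_mm": (2.0, 100.0),
    "polyamine_mm": (0.01, 10.0),
}


@dataclass
class CleavageConditions:
    ph: float
    temperature_c: float
    divalent_cation_mm: float
    polyamine_mm: float


def out_of_range(conditions):
    """Return the names of parameters that fall outside the broadest recited range."""
    problems = []
    for name, (low, high) in BROAD_RANGES.items():
        value = getattr(conditions, name)
        if not low <= value <= high:
            problems.append(name)
    return problems


# Hypothetical example near the particularly preferred values (pH 7.5, 37 C, 5 mM cation).
print(out_of_range(CleavageConditions(7.5, 37.0, 5.0, 1.0)))   # no parameters flagged
print(out_of_range(CleavageConditions(5.0, 70.0, 5.0, 1.0)))   # pH and temperature flagged
```
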
G. Vectors 

The present invention also features expression vectors including a nucleic acid
segment encoding an enzymatic DNA molecule of the present invention situated within
the vector, preferably in a manner which allows expression of that enzymatic DNA
molecule within a target cell (e.g., a plant or animal cell).

Thus, in general, a vector according to the present invention preferably includes
a plasmid, cosmid, phagemid, virus, or phage vector. Preferably, suitable vectors
comprise single-stranded DNA (ssDNA), e.g., circular phagemid ssDNA. It should also
be appreciated that useful vectors according to the present invention need not be
circular.

In one variation, nucleotide sequences flanking each of the additional enzymatic
DNA molecule-encoding sequences are preferably provided, which sequences may be
recognized by the first enzymatic DNA molecule. The intervening or flanking sequences
preferably comprise at least 1 nucleotide; more preferably, intervening or flanking
sequences are about 2-20 nucleotides in length, with sequences of about 5-10
nucleotides in length being particularly preferred.

The addition of polynucleotide tails may also be useful to protect the 3' end of
an enzymatic DNA molecule according to the present invention. These may be provided 
by attaching a polymeric sequence by employing the enzyme terminal transferase. 

A vector according to the present invention includes two or more enzymatic
DNA molecules. In one embodiment, a first enzymatic DNA molecule has intramolecular
cleaving activity and is able to recognize and cleave nucleotide sequences to release
other enzymatic DNA sequences; i.e., it is able to function to "release" other enzymatic
DNA molecules from the vector. For example, a vector is preferably constructed so that
when the first enzymatic DNA molecule is expressed, that first molecule is able to
cleave nucleotide sequences flanking additional nucleotide sequences encoding a second
enzymatic DNA molecule, a third enzymatic DNA molecule, and so forth. Presuming
said first enzymatic DNA molecule (i.e., the "releasing" molecule) is able to cleave
oligonucleotide sequences intramolecularly, the additional (e.g. second, third, and so on)
enzymatic DNA molecules (i.e., the "released" molecules) need not possess
characteristics identical to the "releasing" molecule. For example, in one embodiment,
the "released" (i.e., the second, third, etc.) enzymatic DNA molecules are able to cleave
specific RNA sequences, while the first ("releasing") enzymatic DNA molecule has
nuclease activity allowing it to liberate the "released" molecules. In another
embodiment, the "released" enzymatic DNA molecule has amide bond-cleaving activity,
while the first ("releasing") enzymatic DNA molecule has nuclease activity.

Alternatively, the first enzymatic DNA molecule may be encoded on a separate
vector from the second (and third, fourth, etc.) enzymatic DNA molecule(s) and may
have intermolecular cleaving activity. As noted herein, the first enzymatic DNA
molecule can be a self-cleaving enzymatic DNA molecule (e.g., a deoxyribozyme), and
the second enzymatic DNA molecule may be any desired type of enzymatic DNA
molecule. When a vector is caused to express DNA from these nucleic acid sequences,
that DNA has the ability under appropriate conditions to cleave each of the flanking
regions, thereby releasing one or more copies of the second enzymatic DNA molecule.
If desired, several different second enzymatic DNA molecules can be placed in the same
cell or carrier to produce different deoxyribozymes. It is also contemplated that any one
or more vectors may comprise one or more ribozymes or deoxyribozymes in any
combination of "releasing" and "released" enzymatic nucleic acid molecules, as long as
such a combination achieves the desired result: the release of enzymatic nucleic acid
molecules that are capable of cleaving predetermined nucleic acid sequences.

Methods of isolating and purifying enzymatic DNA molecules of the present
invention are also contemplated. In addition to the methods described herein, various
purification methods (e.g. those using HPLC) and chromatographic isolation techniques
are available in the art. See, e.g., the methods described in published international
application no. WO 93/23569, the disclosures of which are incorporated herein by
reference.

It should also be understood that various combinations of the embodiments
described herein are included within the scope of the present invention. Other features
and advantages of the present invention will be apparent from the descriptions
hereinabove, from the Examples to follow, and from the claims.



EXAMPLES

The following examples illustrate, but do not limit, the present invention.

Example 1

In Vitro Evolution of Enzymatic DNA Molecules:

An Overview

In vitro selection and in vitro evolution techniques allow new catalysts to be
isolated without a priori knowledge of their composition or structure. Such methods
have been used to obtain RNA enzymes with novel catalytic properties. For example,
ribozymes that undergo autolytic cleavage with lead cation have been derived from a
randomized pool of tRNA(Phe) molecules (Pan and Uhlenbeck, Biochemistry 31 : 3887-3895
(1992)). Group I ribozyme variants have been isolated that can cleave DNA (Beaudry
and Joyce, Science 257 : 635-641 (1992)) or that have altered metal dependence
(Lehman and Joyce, Nature 361 : 182-185 (1993)). Starting with a pool of random RNA
sequences, molecules have been obtained that catalyze a polymerase-like reaction
(Bartel and Szostak, Science 261 : 1411-1418 (1993)). In the present example,
refinement of specific catalytic properties of an evolved enzyme via alteration of the
selection constraints during an in vitro evolution procedure is described.

Darwinian evolution requires the repeated operation of three processes: (a)
introduction of genetic variation; (b) selection of individuals on the basis of some fitness
criterion; and (c) amplification of the selected individuals. Each of these processes can
be realized in vitro (Joyce, Gene 82 : 83 (1989)). A gene can be mutagenized by
chemical modification, incorporation of randomized mutagenic oligodeoxynucleotides, or
inaccurate copying by a polymerase. (See, e.g., Cadwell and Joyce, in PCR Methods
and Applications 2 : 28-33 (1992); Cadwell and Joyce, PCR Methods and Applications 3
(Suppl.) : S136-S140 (1994); Chu, et al., Virology 98 : 168 (1979); Shortle, et al., Meth.
Enzymol. 100 : 457 (1983); Myers, et al., Science 229 : 242 (1985); Matteucci, et al.,
Nucleic Acids Res. 11 : 3113 (1983); Wells, et al., Gene 34 : 315 (1985); McNeil, et al.,
Mol. Cell. Biol. 5 : 3545 (1985); Hutchison, et al., PNAS USA 83 : 710 (1986);
Derbyshire, et al., Gene 46 : 145 (1986); Zakour, et al., Nature 295 : 708 (1982);
Lehtovaara, et al., Protein Eng. 2 : 63 (1988); Leung, et al., Technique 1 : 11 (1989);
Zhou, et al., Nucl. Acids Res. 19 : 6052 (1991).)
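
The three-step cycle described above (variation, selection, amplification) can be sketched as a toy simulation. This is only an illustrative model under stated assumptions: the fitness criterion (counting occurrences of a chosen motif), the pool size, the fraction retained, and the mutation rate are all hypothetical and are not taken from the procedures of these Examples.

```python
import random

ALPHABET = "ACGT"


def mutate(seq, rate, rng):
    """(a) Variation: redraw each position at random with probability `rate`."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else base for base in seq
    )


def fitness(seq, motif="AGCG"):
    """Hypothetical fitness criterion: number of occurrences of `motif`."""
    return seq.count(motif)


def evolve(pool, rounds, keep_fraction, rate, rng):
    """Run repeated rounds of selection, amplification, and mutation."""
    size = len(pool)
    for _ in range(rounds):
        # (b) Selection: keep the fittest fraction of the pool.
        ranked = sorted(pool, key=fitness, reverse=True)
        survivors = ranked[: max(1, int(size * keep_fraction))]
        # (c) Amplification back to the original pool size, with
        # (a) variation introduced by error-prone copying.
        pool = [mutate(rng.choice(survivors), rate, rng) for _ in range(size)]
    return pool
```

A pool of random 50mers can be passed to `evolve` together with a seeded `random.Random` instance so that runs are reproducible; the mean of `fitness` over the pool before and after gives a rough measure of enrichment in this toy model.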

The gene product can be selected, for example, by its ability to bind a ligand or
to carry out a chemical reaction. (See, e.g., Joyce, Id. (1989); Robertson and Joyce,
Nature 344 : 467 (1990); Tuerk, et al., Science 249 : 505 (1990).) The gene that
corresponds to the selected gene product can be amplified by a reciprocal primer
method, such as the polymerase chain reaction (PCR). (See, e.g., Saiki, et al., Science
230 : 1350-54 (1985); Saiki, et al., Science 239 : 487-491 (1988).)

Alternatively, nucleic acid amplification may be carried out using self-sustained
sequence replication (3SR). (See, e.g., Guatelli, et al., PNAS USA 87 : 1874 (1990), the
disclosures of which are incorporated by reference herein.) According to the 3SR
method, target nucleic acid sequences may be amplified (replicated) exponentially in
vitro under isothermal conditions by using three enzymatic activities essential to
retroviral replication: (1) reverse transcriptase, (2) RNase H, and (3) a DNA-dependent
RNA polymerase. By mimicking the retroviral strategy of RNA replication by means of
cDNA intermediates, this reaction accumulates cDNA and RNA copies of the original
target.

In summary, if one is contemplating the evolution of a population of enzymatic
DNA molecules, a continuous series of reverse transcription and transcription reactions
replicates an RNA target sequence by means of cDNA intermediates. The crucial
elements of this design are (a) the oligonucleotide primers both specify the target and
contain 5' extensions encoding the T7 RNA polymerase binding site, so that the
resultant cDNAs are competent transcription templates; (b) cDNA synthesis can proceed
to completion of both strands due to the degradation of template RNA in the
intermediate RNA-DNA hybrid by RNase H; and (c) the reaction products (cDNA and
RNA) can function as templates for subsequent steps, enabling exponential replication.

If one is evolving enzymatic DNA molecules, various critical elements of this
design are somewhat different, as disclosed in these Examples. For instance, (1) the
oligonucleotide primers specify the target and are preferably "marked" or labeled in
some fashion - e.g., via biotinylation - so the resultant competent template strands are
easily identified; and (2) the in vitro selection procedure used preferably depends upon
the identification of the most favorable release mechanism.




WO 96/17086 



PCT/US95/15580 






A major obstacle to realizing Darwinian evolution in vitro is the need to integrate 
mutation and amplification, both of which are genotype-related, with selection, which is 
phenotype-related. In the case of nucleic acid enzymes, for which genotype and 
phenotype are embodied in the same molecule, the task is simplified. 



It is well known that single-stranded DNA can assume interesting tertiary
structures. The structure of a "tDNA", for example, closely resembles that of the
corresponding tRNA. (See Paquette, et al., Eur. J. Biochem. 189 : 259-265 (1990).)
Furthermore, it has been possible to replace as many as 31 of 35 ribonucleotides within
a hammerhead ribozyme, while retaining at least some catalytic activity. (See Perreault,
et al., Nature 344: 565-567 (1990); Williams, et al., Proc. Natl. Acad. Sci. USA 89 :
918-921 (1992); Yang, et al., Biochemistry 31 : 5005-5009 (1992).)

In vitro selection techniques have been applied to large populations of 
random-sequence DNAs, leading to the recovery of specific DNA "aptamers" that bind a 
target ligand with high affinity (Bock, et al., Nature 355 : 564-566 (1992); Ellington & 
Szostak, Nature 355: 850-852 (1992); Wyatt & Ecker, PNAS USA 91 : 1356-1360 
(1994)). Recently, two groups carried out the first NMR structural determination of an 
aptamer, a 15mer DNA that forms a G-quartet structure and binds the protein thrombin 
with high affinity (Wang, et al., Biochemistry 32 : 1899-1904 (1993); Macaya, et al., 
PNAS USA 90: 3745-3749 (1993)). These findings were corroborated by an X-ray 
crystallographic analysis (Padmanabhan, et al., J. Biol. Chem. 268 : 17651-17654 
(1993)). 

The ability to bind a substrate molecule with high affinity and specificity is a 
prerequisite of a good enzyme. In addition, an enzyme must make use of 
well-positioned functional groups, either within itself or a cofactor, to promote a 
particular chemical transformation. Furthermore, the enzyme must remain unchanged 
over the course of the reaction and be capable of operating with catalytic turnover. 
Some would add the requirement that it be an informational macromolecule, comprised 
of subunits whose specific ordering is responsible for catalytic activity. While these 
criteria are open to debate on both semantic and chemical grounds, they serve to 
distinguish phenomena of chemical rate enhancement that range from simple solvent 
effects to biological enzymes operating at the limit of substrate diffusion (Albery & 
Knowles, Biochemistry 15 : 5631-5640 (1976)). 

As described in greater detail hereinbelow, we sought to develop a general
method for rapidly obtaining DNA catalysts and DNA enzymes, starting from random
sequences. As an initial target, we chose a reaction that we felt was well within the
capability of DNA: the hydrolytic cleavage of an RNA phosphodiester, assisted by a
divalent metal cofactor. This is the same reaction that is carried out by a variety of
naturally-occurring RNA enzymes, including the hammerhead and hairpin motifs. (See,
e.g., Forster A.C. & Symons R.H., Cell 49 : 211-220 (1987); Uhlenbeck, Nature 328 :
596-600 (1987); Hampel & Tritz, Biochemistry 28 : 4929-4933 (1989).)

It has recently been shown that, beginning with a randomized library of tRNA
molecules, one can obtain ribozymes that have Pb2+-dependent, site-specific RNA
phosphoesterase activity at neutral pH (Pan & Uhlenbeck, Biochemistry 31 : 3887-3895
(1992); Pan & Uhlenbeck, Nature 358 : 560-563 (1992)). This is analogous to the
fortuitous self-cleavage reaction of yeast tRNA(Phe) (Dirheimer & Werner, Biochimie 54:
127-144 (1972)), which depends on specific coordination of a Pb2+ ion at a defined site
within the tRNA. (See Rubin & Sundaralingam, J. Biomol. Struct. Dyn. 1: 639-646
(1983); Brown, et al., Biochemistry 24 : 4785-4801 (1985).)

As disclosed herein, our goals included the development of DNAs that could 
carry out Pb2+-dependent cleavage of a particular RNA phosphoester, initially presented
within a short leader sequence attached to the 5' end of the DNA, and ultimately 
located within a separate molecule that could be cleaved in an intermolecular fashion 
with rapid catalytic turnover. These goals were successfully achieved, as described 
further below. 

No assumptions were made as to how the DNA would interact with the target
phosphoester and surrounding nucleotides. Beginning with a pool of approximately 10^14
random 50mer sequences, in vitro selection was allowed to run its course. After five
rounds of selection carried out over four days, the population as a whole had attained
the ability to cleave the target phosphoester in the presence of 1 mM Pb2+ at a rate of
about 0.2 min^-1. This is an approximately 10^5-fold increase compared to the
spontaneous rate of cleavage under the same reaction conditions.
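
The figures quoted above imply a spontaneous cleavage rate and a pair of half-lives that are easy to work out. The short calculation below is a sketch that assumes simple first-order kinetics and treats the "approximately 10^5-fold" enhancement as exactly 10^5; the variable names are illustrative only.

```python
import math

K_SELECTED = 0.2      # min^-1, rate of the selected population (from the text)
ENHANCEMENT = 1e5     # fold increase over the spontaneous rate (approximate)


def half_life(k_per_min):
    """Half-life in minutes of a first-order process with rate constant k."""
    return math.log(2) / k_per_min


# Implied spontaneous (uncatalyzed) rate: about 2e-6 min^-1.
k_spontaneous = K_SELECTED / ENHANCEMENT

t_half_selected = half_life(K_SELECTED)        # about 3.5 minutes
t_half_spontaneous = half_life(k_spontaneous)  # about 3.5e5 minutes
t_half_spontaneous_days = t_half_spontaneous / (60 * 24)  # roughly 240 days
```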

Individuals were isolated from the population, sequenced, and assayed for
catalytic activity. Based on this information, the reaction was converted to an
intermolecular format and then simplified to allow site-specific cleavage of a 19mer
substrate by a 38mer DNA enzyme, in a reaction that proceeds with a turnover rate of 1
min^-1 at 23°C and pH 7.0 in the presence of 1 mM PbOAc.

B. In Vitro Selection Scheme

A starting pool of approximately 10^14 single-stranded DNA molecules was
generated, all of which contain a 5' biotin moiety, followed successively by a fixed
domain that includes a single ribonucleotide, a potential catalytic domain comprised of
50 random deoxyribonucleotides, and a second fixed domain that lay at the 3' terminus
(Fig. 1).
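
To put the pool size in perspective, the number of possible 50-nucleotide sequences can be compared with the roughly 10^14 molecules in the starting pool. The calculation below is a back-of-the-envelope sketch; it assumes four equally likely bases at each random position and treats every molecule in the pool as distinct.

```python
RANDOM_POSITIONS = 50
POOL_SIZE = 1e14  # approximate number of molecules in the starting pool

# Every random position can be any of four bases.
sequence_space = 4 ** RANDOM_POSITIONS  # exactly 2**100, about 1.27e30

# Upper bound on the fraction of all possible 50mers sampled by the pool.
fraction_sampled = POOL_SIZE / sequence_space  # on the order of 1e-16
```

Under these assumptions the pool samples only a minute fraction of the possible sequences, so any catalyst recovered is one of many potential solutions rather than the product of an exhaustive search.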

The pool was constructed by a nested PCR (polymerase chain reaction)
technique, beginning with synthetic DNA that contained 50 random nucleotides flanked
by primer binding sites. The nested PCR primer was a 5'-biotinylated synthetic
oligodeoxynucleotide with a 3'-terminal adenosine ribonucleotide.

Ribonucleotide-terminated oligonucleotides efficiently prime template-directed elongation 
in the context of the PCR (L.E. Orgel, personal communication), in this case giving rise 
to an extension product that contains a single embedded ribonucleotide. 

Figure 1 illustrates a selective amplification scheme for isolation of DNAs that
cleave a target RNA phosphoester. Double-stranded DNA containing a stretch of 50
random nucleotides is amplified via PCR, employing a 5'-biotinylated DNA primer (e.g.,
primer 3 - 3a or 3b) terminated at the 3' end by an adenosine ribonucleotide
(represented by the symbol "N" or "rA", wherein both N and rA represent an adenosine
ribonucleotide). This primer is extended by Taq polymerase to yield a DNA product that
contains a single embedded ribonucleotide. The resulting double-stranded DNA is
immobilized on a streptavidin matrix and the unbiotinylated DNA strand is removed by
washing with 0.2 N NaOH. After re-equilibrating the column with a buffered solution,
the column is washed with the same solution with added 1 mM PbOAc. DNAs that
undergo Pb2+-dependent self-cleavage are released from the column, collected in the
eluant, and amplified by PCR. The PCR products are then used to initiate the next round
of selective amplification.

The PCR products were passed over a streptavidin affinity matrix, resulting in
noncovalent attachment of the 5'-biotinylated strand of the duplex DNA. The
nonbiotinylated strand was removed by brief washing with 0.2 N NaOH, and the bound
strand was equilibrated in a buffer containing 0.5 M NaCl, 0.5 M KCl, 50 mM MgCl2,
and 50 mM HEPES (pH 7.0) at 23°C. Next, 1 mM PbOAc was provided in the same
buffer, allowing Pb2+-dependent cleavage to occur at the target phosphoester, thereby
releasing a subset of the DNAs from the streptavidin matrix. In principle, an individual
DNA might facilitate its own release by various means, such as disruption of the
interaction between biotin and streptavidin or cleavage of one of the
deoxyribonucleotide linkages. It was felt that cleavage of the ribonucleoside 3'-O-P
bond would be the most likely mechanism for release, based on the relative lability of
this linkage, and that Pb2+-dependent hydrolytic cleavage would allow release to occur
most rapidly. In principle, however, the in vitro selection procedure should identify the
most favorable release mechanism as well as those individuals best able to carry out
that mechanism.

DNA molecules released from the matrix upon addition of Pb2+ were collected in
the eluant, concentrated by precipitation with ethanol, and subjected to nested PCR
amplification. As in the construction of the starting pool of molecules, the first PCR
amplification utilized primers that flank the random region (primers 1 and 2) and the
second utilized a 5'-biotinylated primer (primer 3b) that has a 3'-terminal riboadenylate,
thereby reintroducing the target RNA phosphoester. The entire selective amplification
procedure requires 3-4 hours to perform.

The molecules are purified in three ways during each round of this procedure:
first, following PCR amplification, by extracting twice with phenol and once with
chloroform / isoamyl alcohol, then precipitating with ethanol; second, following
attachment of the DNA to streptavidin, by washing away all the nonbiotinylated
molecules under strongly denaturing conditions; and third, following elution with Pb2+,
by precipitating with ethanol. There is no gel electrophoresis purification step, and thus
no selection pressure constraining the molecules to a particular length.

C. Selection of Catalytic DNA

We carried out five successive rounds of in vitro selection, progressively
decreasing the reaction time following addition of Pb2+ in order to increase
the stringency of selection. During rounds 1 through 3, the reaction time was 1 hour;
during round 4, the reaction time was 20 minutes; and during round 5, it was 1 minute.
The starting pool of single-stranded DNAs, together with the population of molecules
obtained after each round of selection, was assayed for self-cleavage activity under
conditions identical to those employed during in vitro selection (see Fig. 2).

For this assay, the molecules were prepared with a 5'-32P rather than a 5'-biotin
moiety, allowing detection of both the starting material and the 5' cleavage product.
Following a 5-minute incubation, there was no detectable activity in the initial pool (G0)
or in the population obtained after the first and second rounds of selection. DNAs
obtained after the third round (G3) exhibited a modest level of activity; this activity
increased steadily, reaching approximately 50% self-cleavage for the DNAs obtained
after the fifth round of selection (G5). Cleavage was detected only at the target
phosphoester, even after long incubation times. This activity was lost if Pb2+ was
omitted from the reaction mixture.
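
The observation of roughly 50% self-cleavage after a 5-minute incubation can be turned into an approximate rate constant. The sketch below assumes single-exponential (first-order) kinetics and a population in which every molecule is capable of reacting; neither assumption is stated in the text, and the function names are illustrative only.

```python
import math


def rate_constant(fraction_cleaved, minutes):
    """First-order rate constant (min^-1) from the fraction cleaved at time t."""
    return -math.log(1.0 - fraction_cleaved) / minutes


def fraction_cleaved(k_per_min, minutes):
    """Fraction cleaved after `minutes` for a first-order rate constant k."""
    return 1.0 - math.exp(-k_per_min * minutes)


# About 50% cleavage of the G5 population in 5 min gives k of about 0.14 min^-1.
k_g5 = rate_constant(0.5, 5.0)
```

Under these assumptions the estimate is of the same order as the population rate of about 0.2 min^-1 quoted earlier in this Example.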

Figure 2 illustrates the self-cleavage activity of the starting pool of DNA (G0)
and populations obtained after the first through fifth rounds of selection (G1 - G5).
Reaction mixtures contained 50 mM MgCl2, 0.5 M NaCl, 0.5 M KCl, 50 mM HEPES (pH
7.0 at 23°C), and 3 nM [5'-32P]-labeled DNA, incubated at 23°C for 5 min either in the
presence or in the absence of 1 mM PbOAc. The symbol Pre represents 108-nucleotide
precursor DNA (SEQ ID NO 4); Clv, 28-nucleotide 5'-cleavage product (SEQ ID NO 5);
and M, primer 3a (SEQ ID NO 6), corresponding in length to the 5'-cleavage product.

The 28-nucleotide 5' cleavage product (Clv) illustrated preferably has the
sequence 5'-GGGACGAATTCTAATACGACTCACTATN-3', wherein "N" represents
adenosine ribonucleotide with an additional 2', 3'-cyclic phosphate on the 3' end (SEQ
ID NO 5). In alternative embodiments, "N" represents adenosine ribonucleotide with an
additional 2' or 3' phosphate on the 3' end of the molecule.

In Figure 2, the "G0" lane "Pre" band comprises a sampling of 108-nucleotide
precursor DNAs that each include 50 random nucleotides. Therefore, any given "Pre"
sampling will contain a wide variety of precursor DNAs, and each sampling will likely
differ from previous and subsequent samplings. The "G1" through "G5" lanes contain
"Pre" bands that are increasingly enriched for catalytic DNA molecules, but still contain
a large number of different DNA sequences (i.e., differing in the 50 nucleotide
randomized domain). A sample of these different sequences from "G5 Pre" DNA is
provided in Figure 3.

Shotgun cloning techniques were employed to isolate individuals from the G5
population; the complete nucleotide sequences of 20 of these subclones were then
determined (see Fig. 3). (Also see, e.g., Cadwell and Joyce, in PCR Methods and
Applications 2: 28-33 (1992); Cadwell and Joyce, PCR Methods and Applications 3
(Suppl.) : S136-S140 (1994).) Of the 20 sequences, five were unique, two occurred
twice, one occurred three times, and one occurred eight times. All of the individual
variants share common sequence elements within the 50-nucleotide region that had
been randomized in the starting pool of DNA. They all contain two presumed template
regions, one with complementarity to a stretch of nucleotides that lies just upstream
from the cleavage site and the other with complementarity to nucleotides that lie at
least four nucleotides downstream. Between these two presumed template regions lies
a variable domain of 1-11 nucleotides, followed by the fixed sequence 5'-AGCG-3', then
a second variable domain of 3-8 nucleotides, and finally the fixed sequence 5'-CG-3' or
5'-CGA-3'. Nucleotides that lie outside of the two presumed template regions are highly
variable in both sequence and length. In all of the sequenced subclones, the region
corresponding to the 50 initially-randomized nucleotides remains a total of 50
nucleotides in length.
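
The subclone tally and the conserved-core description above can be checked and expressed compactly. The sketch below first confirms that the stated occurrence counts add up to 20 subclones and nine distinct variants, then encodes the core pattern (5'-AGCG-3', a 3-8 nucleotide variable domain, then 5'-CG-3' or 5'-CGA-3') as a regular expression. The regular expression and the test strings used with it are illustrative assumptions; they are not sequences from Figure 3.

```python
import re

# occurrences per sequence -> number of distinct sequences with that count
SUBCLONE_COUNTS = {1: 5, 2: 2, 3: 1, 8: 1}

total_subclones = sum(occ * n for occ, n in SUBCLONE_COUNTS.items())
distinct_variants = sum(SUBCLONE_COUNTS.values())

# Conserved core as described in the text: AGCG, 3-8 variable nucleotides,
# then CG (optionally followed by A). Hypothetical encoding for illustration.
CORE = re.compile(r"AGCG[ACGT]{3,8}CGA?")


def has_core(seq):
    """Return True if `seq` contains the conserved core pattern."""
    return CORE.search(seq) is not None
```

For example, the made-up string `"TTAGCGATCCGGA"` contains the pattern (variable domain `ATC`), whereas `"AGCGATCG"` does not, because only two nucleotides separate the fixed elements.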

Figure 3 illustrates the sequence alignment of individual variants isolated from
the population after five rounds of selection. The fixed substrate domain
(5'-GGGACGAATTCTAATACGACTCACTATrAGGAAGAGATGGCGAC-3', or
5'-GGGACGAATTCTAATACGACTCACTATNGGAAGAGATGGCGAC-3', where N represents
adenosine ribonucleotide) (SEQ ID NO 13) is shown at the top, with the target
riboadenylate identified with an inverted triangle. Substrate nucleotides that are
commonly involved in presumed base-pairing interactions are indicated by a vertical bar.
Sequences corresponding to the 50 initially-randomized nucleotides are aligned
antiparallel to the substrate domain. All of the variants are 3'-terminated by the fixed
sequence 5'-CGGTAAGCTTGGCAC-3' (SEQ ID NO 1) ("primer site"; not shown).
Nucleotides within the initially-randomized region that are presumed to form base pairs
with the substrate domain are indicated on the right and left sides of the Figure; the
putative base-pair-forming (or substrate binding) regions of the enzymatic DNA
molecules are individually boxed in each sequence shown. The highly-conserved
nucleotides within the putative catalytic domain are illustrated in the two boxed
columns.
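
The lengths quoted for the precursor (108 nucleotides) and the 5' cleavage product (28 nucleotides) follow from the fixed sequences given above. The sketch below simply adds up the parts; it writes the single ribonucleotide as an ordinary "A" for counting purposes, and the variable names are illustrative.

```python
# 5' portion of the fixed substrate domain, ending in the target riboadenylate
# (written here as a plain "A" so that it can be counted).
FIVE_PRIME_FIXED = "GGGACGAATTCTAATACGACTCACTATA"
# Remainder of the fixed substrate domain, downstream of the cleavage site.
DOWNSTREAM_FIXED = "GGAAGAGATGGCGAC"
RANDOM_LENGTH = 50
# Fixed 3'-terminal "primer site".
PRIMER_SITE = "CGGTAAGCTTGGCAC"

substrate_domain_length = len(FIVE_PRIME_FIXED) + len(DOWNSTREAM_FIXED)
precursor_length = substrate_domain_length + RANDOM_LENGTH + len(PRIMER_SITE)

# Cleavage after the riboadenylate leaves a 28-nucleotide 5' product.
five_prime_product_length = len(FIVE_PRIME_FIXED)
three_prime_product_length = precursor_length - five_prime_product_length
```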

While it is anticipated that additional data will be helpful in constructing a
meaningful secondary structural model of the catalytic domain, we note that, like the
hammerhead and hairpin ribozymes, the catalytic domain of our enzymatic DNA
molecules appears to contain a conserved core flanked by two substrate binding regions
(or recognition domains) that interact with the substrate through base-pairing
interactions. Similar to the hammerhead and hairpin ribozymes, the catalytic DNAs also
appear to require a short stretch of unpaired substrate nucleotides - in this case,
5'-GGA-3' - between the two regions that are involved in base pairing.

It was also interesting to note that each of the nine distinct variants exhibited a
different pattern of presumed complementarity with the substrate domain. In some
cases, base pairing was contiguous, while in others it was interrupted by one or more
noncomplementary pairs. The general tendency seems to be to form tighter interaction
with the nucleotides that lie upstream from the cleavage site compared to those that lie
downstream. Binding studies and site-directed mutagenesis analysis should enable us to
gain further insights and to further substantiate this conjecture.

In order to gain further insight into the sequence requirements for catalytic
function, the self-cleavage activity of six of the nine variants was tested and evaluated
under the within-described selection conditions (see Fig. 3). Not surprisingly, the
sequence that occurred in eight of the 20 subclones proved to be the most reactive,
with a first-order rate constant of 1.4 min^-1. All of the studied variants were active in
the self-cleavage assay and all gave rise to a single 5'-labeled product corresponding to
cleavage at the target RNA phosphoester.

The dominant subclone was further analyzed under a variety of reaction
conditions. Its self-cleavage activity was dependent on Pb2+ but was unaffected if
Mg2+ was omitted from the reaction mixture. There was a requirement for a
monovalent cation as well, which can be met by either Na+ or K+. The reaction rate
increased linearly with increasing concentration of monovalent cation over the range of
0 - 1.0 M (r = 0.998). Other variables that may affect the reaction, such as pH,
temperature, and the presence of other divalent metals, are in the process of being
evaluated further.

Example 2

Materials and Methods 

A. Oligonucleotides and Oligonucleotide Analogs 

Synthetic DNAs and DNA analogs were purchased from Operon Technologies.
The 19-nucleotide substrate, 5'-pTCACTATrAGGAAGAGATGG-3' (or
5'-pTCACTATNGGAAGAGATGG-3', wherein "N" represents adenosine ribonucleotide)
(SEQ ID NO 7), was prepared by reverse-transcriptase catalyzed extension of
5'-pTCACTATrA-3' (or 5'-pTCACTATN-3', wherein "N" represents adenosine
ribonucleotide) (SEQ ID NO 8), as previously described (Breaker, Banerji, & Joyce,
Biochemistry 33 : 11980-11986 (1994)), using the template
5'-CCATCTCTTCCTATAGTGAGTCCGGCTGCA-3' (SEQ ID NO 9). Primer 3,
5'-GGGACGAATTCTAATACGACTCACTATrA-3' (or
5'-GGGACGAATTCTAATACGACTCACTATN-3', wherein "N" represents adenosine
ribonucleotide) (SEQ ID NO 6), was either 5'-labeled with [γ-32P]ATP and T4
polynucleotide kinase (primer 3a) or 5'-thiophosphorylated with [γ-S]ATP and T4
polynucleotide kinase and subsequently biotinylated with
N-iodoacetyl-N'-biotinylhexylenediamine (primer 3b).
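
The relationship between the 19-nucleotide substrate, the 8-nucleotide primer that is extended to make it, and the extension template can be checked by reverse complementation. The sketch below writes the riboadenylate as a plain "A" and ignores the 5' phosphate; the helper function is illustrative and not part of the described procedure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


SUBSTRATE = "TCACTATAGGAAGAGATGG"             # SEQ ID NO 7, rA written as A
PRIMER = "TCACTATA"                           # SEQ ID NO 8, rA written as A
TEMPLATE = "CCATCTCTTCCTATAGTGAGTCCGGCTGCA"   # SEQ ID NO 9

# The 5' end of the template is the reverse complement of the full substrate,
# so extending the annealed primer to the end of the template yields it.
templated_product = reverse_complement(TEMPLATE[: len(SUBSTRATE)])

# Position on the template where the primer anneals.
primer_site_start = TEMPLATE.find(reverse_complement(PRIMER))
```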

B. DNA Pool Preparation 

The starting pool of DNA was prepared by PCR using the synthetic oligomer
5'-GTGCCAAGCTTACCG-N50-GTCGCCATCTCTTCC-3' (SEQ ID NO 4), where N is an
equimolar mixture of G, A, T and C. A 2-ml PCR, containing 500 pmoles of the
randomized oligomer, 1,000 pmoles primer 1 (5'-GTGCCAAGCTTACCG-3', SEQ ID NO
10), 500 pmoles primer 2
(5'-CTGCAGAATTCTAATACGACTCACTATAGGAAGAGATGGCGAC-3', SEQ ID NO 11),
500 pmoles primer 3b, 10 μCi [α-32P]dATP, and 0.2 U Taq DNA polymerase, was
incubated in the presence of 50 mM KCl, 1.5 mM MgCl2, 10 mM Tris-HCl (pH 8.3 at
23°C), 0.01% gelatin, and 0.2 mM of each dNTP for 1 min at 92°C, 1 min at 50°C, and
2 min at 72°C, then 5 cycles of 1 min at 92°C, 1 min at 50°C, and 1 min at 72°C. The
resulting mixture was extracted twice with phenol and once with chloroform / isoamyl
alcohol, and the DNA was isolated by precipitation with ethanol.
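
The 500 pmoles of randomized oligomer used above can be converted to a number of molecules for comparison with the pool size of approximately 10^14 quoted in Example 1. This is a simple unit conversion, sketched under the assumption that every template molecule is intact and counted once.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

template_pmol = 500
template_molecules = template_pmol * 1e-12 * AVOGADRO  # about 3e14 molecules
```

The result, on the order of 10^14, is consistent with the stated size of the starting pool.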

C. In Vitro Selection

The starting pool of DNA was resuspended in 500 μL of buffer A (1 M NaCl and
50 mM HEPES (pH 7.0 at 23°C)) and was passed repeatedly over a streptavidin column
(AffiniTip Strep 20, Genosys, The Woodlands, TX). The column was washed with five
100-μl volumes of buffer A, followed by five 100-μl volumes of 0.2 N NaOH, then
equilibrated with five 100-μl volumes of buffer B (0.5 M NaCl, 0.5 M KCl, 50 mM
MgCl2, and 50 mM HEPES (pH 7.0 at 23°C)). The immobilized single-stranded DNA was
eluted over the course of 1 hr with three 20-μl volumes of buffer B with added 1 mM
PbOAc. The entire immobilization and elution process was conducted at 23°C. The
eluant was collected in an equal volume of buffer C (50 mM HEPES (pH 7.0 at 23°C)
and 80 mM EDTA) and the DNA was precipitated with ethanol.

The resulting DNA was amplified in a 100-μL PCR containing 20 pmoles primer
1, 20 pmoles primer 2, 0.05 U Taq polymerase, 50 mM KCl, 1.5 mM MgCl2, 10 mM
Tris-HCl (pH 8.3 at 23°C), 0.01% gelatin, and 0.2 mM of each dNTP for 30 cycles of
10 sec at 92°C, 30 sec at 50°C, and 30 sec at 72°C. The reaction products were
extracted twice with phenol and once with chloroform / isoamyl alcohol, and the DNA
was recovered by precipitation with ethanol. Approximately 4 pmoles of the amplified
DNA was added to a second, nested PCR containing 100 pmoles primer 1, 100 pmoles
primer 3b, 20 μCi [α-32P]dATP, and 0.1 U μl^-1 Taq polymerase, in a total volume of 200
μL that was amplified for 10 cycles of 1 min at 92°C, 1 min at 50°C, and 1 min at
72°C. The PCR products were once more extracted and precipitated, and the resulting
DNA was resuspended in 50 μL buffer A, then used to begin the next round of
selection.

The second and third rounds were carried out as above, except that the nested
PCR at the end of the third round was performed in a 100-μl volume. During the fourth
round, the elution time following addition of Pb2+ was reduced to 20 min (two 20-μL
elution volumes) and only half of the recovered DNA was used in the first PCR, which
involved only 15 temperature cycles. During the fifth round, the elution time was
reduced to 1 min (two 20-μL elution volumes) and only one-fourth of the recovered DNA
was used in the first PCR, which involved 15 temperature cycles. DNA obtained after
the fifth round of selection was subcloned and sequenced, as described previously
(Tsang & Joyce, Biochemistry 33 : 5966-5973 (1994)).

D. Kinetic Analysis of Catalytic DNAs

Populations of DNA and various subcloned individuals were prepared with a
5'-32P label by asymmetric PCR in a 25-μl reaction mixture containing 10 pmoles primer
3a, 0.5 pmoles input DNA, and 0.1 U μl^-1 Taq polymerase, under conditions as described
above, for 10 cycles of 1 min at 92°C, 1 min at 50°C, and 1 min at 72°C. The
resulting (5'-32P)-labeled amplification products were purified by electrophoresis in a
10% polyacrylamide / 8 M urea gel.

Self-cleavage assays were carried out following preincubation of the DNA in
buffer B for 10 min. Reactions were initiated by addition of PbOAc to 1 mM final
concentration and were terminated by addition of an equal volume of buffer C. Reaction
products were separated by electrophoresis in a 10% polyacrylamide / 8 M urea gel. Kinetic
assays under multiple-turnover conditions were carried out in buffer B that included 50
μg ml^-1 BSA to prevent adherence of material to the vessel walls. Substrate and enzyme
molecules were preincubated separately for 5 min in reaction buffer that lacked Pb2+,
then combined, and the reaction was initiated by addition of PbOAc to a final
concentration of 1 mM.

Example 3

Evolution of Deoxyribozymes
That Cleave Intermolecularly

A. Conversion to an Intermolecular Format

Based on the variable pattern of presumed base-pairing interactions between the
catalytic and substrate domains of the studied variants, it was felt that it would be
reasonably straightforward to convert the DNA-catalyzed reaction to an intermolecular
format. In doing so, we wished to simplify the two substrate-binding regions of the
catalyst so that each would form an uninterrupted stretch of 7-8 base pairs with the
substrate. In addition, we wished to provide a minimal substrate, limited to the two
base-pairing regions and the intervening sequence 5'-GGA-3' (Fig. 4A).

Figures 4A and 4B illustrate DNA-catalyzed cleavage of an RNA phosphoester in
an intermolecular reaction that proceeds with catalytic turnover. Figure 4A is a
diagrammatic representation of the complex formed between the 19mer substrate and
38mer DNA enzyme. The substrate contains a single adenosine ribonucleotide ("rA" or
"N", adjacent to the arrow), flanked by deoxyribonucleotides. The synthetic DNA
enzyme is a 38-nucleotide portion of the most frequently occurring variant shown in Fig.
3. Highly-conserved nucleotides located within the putative catalytic domain are
"boxed". As illustrated, one conserved sequence is "AGCG", while another is "CG"
(reading in the 5'-3' direction).

Figure 4B shows an Eadie-Hofstee plot used to determine Km (negative slope)
and Vmax (y-intercept) for DNA-catalyzed cleavage of [5'-³²P]-labeled substrate under
conditions identical to those employed during in vitro selection. Initial rates of cleavage
were determined for reactions involving 5 nM DNA enzyme and either 0.125, 0.5, 1, 2,
or 4 µM substrate.
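
By way of illustration only, the following sketch shows how an Eadie-Hofstee plot yields the kinetic constants: the line v = Vmax - Km (v/[S]) is fitted by ordinary least squares, so that Km is the negative slope and Vmax the y-intercept. The substrate and enzyme concentrations are those stated above; the rates are synthetic values generated from assumed constants (Km = 2 µM, Vmax = 0.005 µM min⁻¹) and are not the measured data of Fig. 4B. The function name is hypothetical.

```python
def eadie_hofstee_fit(substrate_conc, initial_rates):
    """Least-squares fit of v = Vmax - Km * (v / [S]); returns (Km, Vmax)."""
    x = [v / s for v, s in zip(initial_rates, substrate_conc)]  # v/[S]
    y = list(initial_rates)                                     # v
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, intercept  # Km is the negative slope, Vmax the y-intercept


# Concentrations stated in the text: substrate in µM, enzyme at 5 nM = 0.005 µM.
substrate_uM = [0.125, 0.5, 1.0, 2.0, 4.0]
enzyme_uM = 0.005

# ASSUMED constants, used only to synthesize example rates (not measured data).
assumed_km_uM = 2.0
assumed_vmax = 0.005  # µM per min
rates = [assumed_vmax * s / (assumed_km_uM + s) for s in substrate_uM]

km_uM, vmax = eadie_hofstee_fit(substrate_uM, rates)
kcat_per_min = vmax / enzyme_uM  # kcat = Vmax / [E]
print(f"Km = {km_uM:.3f} µM, Vmax = {vmax:.5f} µM/min, kcat = {kcat_per_min:.3f} per min")
```

Because noise-free Michaelis-Menten rates fall exactly on the Eadie-Hofstee line, the fit returns the assumed constants; with real measurements the same function would give least-squares estimates.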

In designing the catalytic domain, we relied heavily on the composition of the
most reactive variant, truncating by two nucleotides at the 5' end and 11 nucleotides at
the 3' end. The 15 nucleotides that lay between the two template regions were left
unchanged and a single nucleotide was inserted into the 3' template region to form a
continuous stretch of nucleotides capable of forming base pairs with the substrate. The
substrate was simplified to the sequence 5'-TCACTATrA•GGAAGAGATGG-3' (or
5'-TCACTATN•GGAAGAGATGG-3', wherein "N" represents adenosine ribonucleotide)
(SEQ ID NO 12), where the underlined nucleotides correspond to the two regions
involved in base pairing with the catalytic DNA molecule.

The simplified reaction system, employing a 38mer catalytic DNA molecule
(catalyst) comprised entirely of deoxyribonucleotides and a 19mer substrate containing a
single ribonucleotide embedded within an otherwise all-DNA sequence, allows efficient
DNA-catalyzed phosphoester cleavage with rapid turnover. Over a 90-minute incubation
in the presence of 0.01 µM catalyst and 1 µM substrate, 46% of the substrate is
cleaved, corresponding to 46 turnovers of the catalyst. A preliminary kinetic analysis of
this reaction was carried out under multiple-turnover conditions. The DNA
catalyst exhibits Michaelis-Menten kinetics, with values for kcat and Km of 1 min⁻¹ and
2 µM, respectively (see Fig. 4B). The value for Km is considerably greater than the
expected dissociation constant between catalyst and substrate based on Watson-Crick
interactions. The substrate was incubated under identical reaction conditions (but in the
absence of the catalyst); a value for kuncat of 4 × 10⁻⁶ min⁻¹ was obtained. This is
consistent with the reported value of 5 × 10⁻³ min⁻¹ for hydrolysis of the more labile
1-nitrophenyl-1,2-propanediol in the presence of 0.5 mM Pb²⁺ at pH 7.0 and 37°C
(Breslow & Huang, PNAS USA 88: 4080-4083 (1991)).
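
The turnover figure quoted above, and the rate enhancement implied by the catalyzed and uncatalyzed rate constants, can be checked with simple arithmetic. The sketch below takes the catalyst and substrate concentrations as 0.01 µM and 1 µM, kcat as 1 min⁻¹, and kuncat as 4 × 10⁻⁶ min⁻¹; the variable names are illustrative only.

```python
# Values taken from the incubation described above.
catalyst_uM = 0.01
substrate_uM = 1.0
fraction_cleaved = 0.46
incubation_min = 90.0

# Turnovers = product formed per molecule of catalyst.
product_uM = fraction_cleaved * substrate_uM
turnovers = product_uM / catalyst_uM
mean_turnovers_per_min = turnovers / incubation_min  # average over the incubation

# Rate enhancement = catalyzed rate constant / uncatalyzed rate constant.
k_cat_per_min = 1.0
k_uncat_per_min = 4e-6
rate_enhancement = k_cat_per_min / k_uncat_per_min

print(f"turnovers: {turnovers:.0f}")
print(f"mean turnovers per min: {mean_turnovers_per_min:.2f}")
print(f"rate enhancement: {rate_enhancement:.1e}")
```

The average of roughly 0.5 turnovers per minute is lower than kcat, as expected when the substrate concentration is below saturation.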

It is now presumed that the phosphoester cleavage reaction proceeds via a
hydrolytic mechanism involving attack by the ribonucleoside 2'-hydroxyl on the vicinal
phosphate, generating a 5' product with a terminal 2'(3')-cyclic phosphate and 3'
product with a terminal 5'-hydroxyl. In support of this mechanism, the 3'-cleavage
product is efficiently phosphorylated with T4 polynucleotide kinase and [γ-³²P]ATP,
consistent with the availability of a free 5'-hydroxyl (data not shown).

B. Discussion 

After five rounds of in vitro selection, a population of single-stranded DNA
molecules that catalyze efficient Pb²⁺-dependent cleavage of a target RNA phosphoester
was obtained. Based on the common features of representative individuals isolated
from this population, a simplified version of both the catalytic and substrate domains
was constructed, leading to a demonstration of rapid catalytic turnover in an
intermolecular context. Thus the 38mer catalytic domain provides an example of a DNA
enzyme, or what might be termed a "deoxyribozyme".

Referring to this molecule as an enzyme, based on the fact that it is an
informational macromolecule capable of accelerating a chemical transformation in a
reaction that proceeds with rapid turnover and obeys Michaelis-Menten kinetics, may
not satisfy everyone's notion of what constitutes an enzyme. Some might insist that an
enzyme, by definition, must be a polypeptide. If, however, one accepts the notion of an
RNA enzyme, then it seems reasonable to adopt a similar view concerning DNA
enzymes. Considering how quickly we were able to generate this molecule from a pool
of random-sequence DNAs, we expect that many other examples of synthetic DNA
enzymes will appear in the near future.

The Pb²⁺-dependent cleavage of an RNA phosphoester was chosen as an initial
target for DNA catalysis because it is a straightforward reaction that simply requires the
proper positioning of a coordinated Pb²⁺-hydroxyl to facilitate deprotonation of the 2'
hydroxyl that lies adjacent to the cleavage site. (See, e.g., Pan, et al., in The RNA
World, Gesteland & Atkins (eds.), pp. 271-302, Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, NY (1993).) Pb²⁺ is known to coordinate to the N7 position of
purines, the O6 position of guanine, the O4 position of uracil, and the N3 position of
cytosine (Brown, et al., Nature 303: 543-546 (1983)). Thus, the differences in sugar
composition and conformation of DNA compared to RNA seemed unlikely to prevent
DNA from forming a well-defined Pb²⁺-binding pocket.

A substrate that contains a single ribonucleotide within an otherwise all-DNA
sequence was chosen because it provided a uniquely favored site for cleavage and
ensured that any resulting catalytic activity would be attributable solely to DNA.
Substrate recognition appears to depend on two regions of base-pairing interactions
between the catalyst and substrate. However, the unpaired substrate nucleotides,
5'-GGA-3', that lie between these two regions may play an important role in substrate
recognition, metal coordination, or other aspects of catalytic function.

It is further anticipated that an all-RNA molecule, other RNA-DNA composites,
and molecules containing one or more nucleotide analogs may be acceptable substrates.
As disclosed herein, the within-described in vitro evolution procedures may successfully
be used to generate enzymatic DNA molecules having the desired specificities; further
analyses along these lines are presently underway.

In addition, studies to determine whether the presumed base-pairing interactions
between enzyme and substrate are generalizable with respect to sequence are in
progress, using the presently-described methods. The within-disclosed Pb²⁺-dependent
deoxyribozymes may also be considered model compounds for exploring the structural
and enzymatic properties of DNA.

The methods employed in the present disclosure for the rapid development of
DNA catalysts will have considerable generality, allowing us to utilize other cofactors to
trigger the cleavage of a target linkage attached to a potential catalytic domain. In this
regard, the development of Mg²⁺-dependent DNA enzymes that specifically cleave
target RNAs under physiological conditions is of interest, as is the development of DNA
enzymes that function in the presence of other cations (see Example 4). Such
molecules will provide an alternative to traditional antisense and ribozyme approaches
for the specific inactivation of target mRNAs.

DNA thus joins RNA and protein on the list of biological macromolecules that are
capable of exhibiting enzymatic activity. The full extent of DNA's catalytic abilities
remains to be explored, but these explorations should proceed rapidly based on in vitro
selection methods such as those employed in this study.

DNA enzymes offer several important advantages compared to other
macromolecular catalysts. First, they are easy to prepare, in an era when most
laboratories have access to an automated DNA synthesizer and the cost of DNA
phosphoramidites has become quite modest. Second, they are very stable compounds,
especially compared to RNA, thus facilitating their use in biophysical studies. Third, we
expect that they can be adapted to therapeutic applications that at present make use of
antisense DNAs that lack RNA-cleavage activity. In vitro selection could be carried out
with DNA analogs, including compounds that are nuclease resistant such as
phosphorothioate-containing DNA, so long as these analogs can be prepared in the form
of a deoxynucleoside 5'-triphosphate and are accepted as a substrate by a
DNA-dependent DNA polymerase. Finally, DNA enzymes offer a new window on our
understanding of the macromolecular basis of catalytic function. It will be interesting,
for example, to carry out comparative analyses of protein-, RNA-, and DNA-based
enzymes that catalyze the same chemical transformation.

Example 4

Other Families of Catalytic DNAs

A starting pool of DNA was prepared by PCR essentially as described in Example
2.B. above, except that the starting pool of DNA comprised molecules containing 40
random nucleotides. Thus, the starting pool of DNA described herein was prepared by
PCR using the synthetic oligomer 5' GGG ACG AAT TCT AAT ACG ACT CAC TAT rA
GG AAG AGA TGG CGA CAT CTC N₄₀ GT GAC GGT AAG CTT GGC AC 3' (SEQ ID NO
23), where N is an equimolar mixture of G, A, T and C, and where the DNA molecules
were selected for the ability to cleave the phosphoester following the target rA. (See
also Figure 6A.)
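
The layout of the synthetic oligomer of SEQ ID NO 23 can be made concrete with a short sketch that assembles the template and checks the position ranges used later in this Example (positions 1-44, 45-92, and 93-107). The conventions used here are assumptions made for illustration: a lower-case "a" stands for the single ribonucleotide rA, "N" stands for a randomized position, and the helper name is hypothetical.

```python
# Lower-case "a" marks the single adenosine ribonucleotide (rA); all other
# residues are deoxyribonucleotides. "N" marks a randomized position.
FIVE_PRIME_FIXED = "GGGACGAATTCTAATACGACTCACTAT" + "a" + "GGAAGAGATGGCGACATCTC"
RANDOM_REGION = "N" * 40
THREE_PRIME_FIXED = "GTGACGGTAAGCTTGGCAC"

template = FIVE_PRIME_FIXED + RANDOM_REGION + THREE_PRIME_FIXED


def residues(seq, first, last):
    """Return residues first..last using 1-based, inclusive numbering."""
    return seq[first - 1:last]


# Number of distinct sequences possible for a 40-nucleotide random region.
pool_complexity = 4 ** 40

print(len(template))                # total length of the oligomer
print(residues(template, 45, 92))   # TCTC + N40 + GTGA
print(f"{pool_complexity:.2e}")     # theoretical sequence diversity
```

The theoretical diversity of a 40-nucleotide random region far exceeds the number of molecules that can be sampled in practice, so any starting pool covers only a small fraction of sequence space.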

Selective amplification was carried out in the presence of either Pb²⁺, Zn²⁺, Mn²⁺,
or Mg²⁺, thereby generating at least four "families" of catalytic DNA molecules. As
illustrated in Figure 5, catalytic DNA molecules demonstrating specific activity were
generated in the presence of a variety of cations.

Figure 5 is a photographic representation showing a polyacrylamide gel
demonstrating specific endoribonuclease activity of four families of selected catalytic
DNAs. Selection of a Pb²⁺-dependent family of molecules was repeated in a side-by-side
fashion as a control. In each group of three lanes, the first lane shows the lack of
activity of the selected population in the absence of the metal cation, the second lane
shows the observed activity in the presence of the metal cation, and the third lane
shows the lack of activity of the starting pool (G0). At present, the order of reactivity is
observed to be Pb²⁺ > Zn²⁺ > Mn²⁺ > Mg²⁺, mirroring the pKa of the corresponding
metal-hydroxide.

WO 96/17086 PCT/US95/15580

After either five (G5) or six (G6) rounds of selective amplification in the presence
of the preselected divalent cation, the desired endonuclease activity was obtained. The
following description of selective amplification in the presence of Mg²⁺ is intended to be
exemplary.

Six rounds of in vitro selective amplification were carried out, following the
method described in Example 2 hereinabove, except that the divalent metal used was 1
mM Mg²⁺ rather than 1 mM Pb²⁺. (See also Breaker and Joyce, Chem. & Biol. 1:
223-229 (1994), incorporated by reference herein, which describes essentially the same
procedure.)

Individual clones were isolated following the sixth round, and the nucleotide
sequence of 24 of these clones was determined. All of the sequences began with: 5'
GGG ACG AAT TCT AAT ACG ACT CAC TAT rA GG AAG AGA TGG CGA CA (SEQ ID
NO 23 from position 1 to 44) and ended with: CGG TAA GCT TGG CAC 3' (SEQ ID
NO 23 from position 93 to 107).

The segment in the middle, corresponding to TCTC N₄₀ GTGA (SEQ ID NO 23
from position 45 to 92) in the starting pool, varied as follows:

(13) CCG CCC ACC TCT TTT ACG AGC CTG TAC GAA ATA GTG CTC TTG
     TTA GTA T (SEQ ID NO 24)
(5)  TCT CTT CAG CGA TGC ACG CTT GTT TTA ATG TTG CAC CCA TGT
     TAG TGA (SEQ ID NO 25)
(2)  TCT CAT CAG CGA TTG AAC CAC TTG GTG GAC AGA CCC ATG TTA
     GTG A (SEQ ID NO 26)
(1)  CCG CCC ACC TCT TTT ACG AGC CTG TAC GAA ATA GTG TTC TTG
     TTA GTA T (SEQ ID NO 27)
(1)  CCG CCC ACC TCT TTT ACG AGC CTG TAC GAA ATA GTG CTC TCG
     TTA GTA T (SEQ ID NO 28)
(1)  TCT CAG ACT TAG TCC ATC ACA CTC TGT GCA TAT GCC TGC TTG
     ATG TGA (SEQ ID NO 29)
(1)  -CT CTC ATC TGC TAG CAC GCT CGA ATA GTG TCA GTC GAT GTG A
     (SEQ ID NO 30).

The initial number in parentheses indicates the number of clones having that
particular sequence. Note that some mutations (highlighted in bold type) occurred at
nucleotide positions other than those that were randomized initially.

The second sequence listed above (i.e., SEQ ID NO 25), which occurred in 5 of
24 clones, was chosen as a lead (i.e., principal) compound for further study. Its
cleavage activity was measured in the presence of a 1 mM concentration of various
divalent metals and 1 M NaCl at pH 7.0 and 23°C:



metal      kobs (min⁻¹)

none       n.d.
Mg²⁺       2.3 × 10⁻³
Mn²⁺       6.8 × 10⁻³
Zn²⁺       4.2 × 10⁻²
Pb²⁺       1.1 × 10⁻²

Thus, the lead compound is active in the presence of all four divalent metals,
even though it was selected for activity in the presence of Mg²⁺. Conversely, DNA
molecules that were selected for activity in the presence of Mn²⁺, Zn²⁺, or Pb²⁺ did not
show any activity in the presence of Mg²⁺.
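
To put the observed rate constants for the lead compound in more tangible terms, the sketch below converts each kobs into a half-life and an extent of cleavage after a fixed time. It assumes, for illustration only, that self-cleavage follows a single-exponential (first-order) time course; that assumption, the 60-minute time point, and the function names are not taken from the measurements themselves.

```python
import math

# Observed rate constants (per minute) for the lead compound, by metal.
K_OBS_PER_MIN = {"Mg2+": 2.3e-3, "Mn2+": 6.8e-3, "Zn2+": 4.2e-2, "Pb2+": 1.1e-2}


def half_life_min(k_obs):
    """Half-life of a first-order process, t1/2 = ln(2) / k (ASSUMED kinetics)."""
    return math.log(2) / k_obs


def fraction_cleaved(k_obs, minutes):
    """Fraction cleaved after a given time for a first-order process."""
    return 1.0 - math.exp(-k_obs * minutes)


for metal, k in K_OBS_PER_MIN.items():
    print(f"{metal}: t1/2 = {half_life_min(k):.0f} min, "
          f"cleaved after 60 min = {fraction_cleaved(k, 60):.1%}")
```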

In addition, the population of DNAs obtained after six rounds of in vitro selection
in the presence of Mg²⁺, when prepared as all-phosphorothioate-containing DNA
analogs, showed Mg²⁺-dependent cleavage activity at an observed rate of ~10⁻³ min⁻¹.
The phosphorothioate-containing analogs were prepared enzymatically so as to have an
Rp configuration at each stereocenter. Such compounds are relatively resistant to
degradation by cellular nucleases compared to unmodified DNA.

The lead compound was re-randomized at 40 nucleotide positions (underlined),
introducing mutations at a frequency of 15% (5% probability of each of the three
possible base substitutions). The re-randomized population was subjected to seven
additional rounds of in vitro selection. During the last four rounds, molecules that were
reactive in the presence of 1 mM Pb²⁺ were removed from the population before the
remainder were challenged to react in the presence of 1 mM Mg²⁺. Individual clones
were isolated following the seventh round and the nucleotide sequence of 14 of these
clones was determined. All of the sequences began with: 5' GGG ACG AAT TCT AAT
ACG ACT CAC TAT rA GG AAG AGA TGG CGA CAT CTC (SEQ ID NO 23, from position
1 to 48), and ended with: GTG ACG GTA AGC TTG GCA C 3' (SEQ ID NO 23, from
position 89 to 107).
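
The consequences of re-randomizing 40 positions at a 15% mutation frequency can be estimated with a binomial model. The sketch below assumes that each position mutates independently with probability 0.15, which is an idealization of the synthesis; the function name is hypothetical.

```python
from math import comb

POSITIONS = 40          # partially randomized positions
MUTATION_RATE = 0.15    # 3 possible substitutions x 5% each


def prob_k_mutations(k, n=POSITIONS, p=MUTATION_RATE):
    """Binomial probability of exactly k mutations among n positions
    (ASSUMES each position mutates independently)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


expected_mutations = POSITIONS * MUTATION_RATE
p_unmutated = prob_k_mutations(0)  # fraction of the pool identical to the lead

print(f"expected mutations per molecule: {expected_mutations:.1f}")
print(f"fraction with no mutations: {p_unmutated:.4f}")
for k in range(0, 13, 3):
    print(f"P(exactly {k} mutations) = {prob_k_mutations(k):.4f}")
```

Under this model each molecule carries about six substitutions on average and only a small fraction of the pool retains the unaltered lead sequence, so the re-randomized population explores the neighborhood of the lead compound rather than re-sampling it.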

The segment in the middle, corresponding to the 40 partially-randomized
positions (N₄₀, SEQ ID NO 23, from position 49 to 88), varied as follows:

(4) TAC AGC GAT TCA CCC TTG TTT AAG GGT TAC ACC CAT GTT A
    (SEQ ID NO 31)
(2) ATC AGC GAT TAA CGC TTG TTT CAA TGT TAC ACC CAT GTT A
    (SEQ ID NO 32)
(2) TTC AGC GAT TAA CGC TTA TTT TAG CGT TAC ACC CAT GTT A
    (SEQ ID NO 33)
(1) ATC AGC GAT TCA CCC TTG TTT TAA GGT TGC ACC CAT GTT A
    (SEQ ID NO 34)
(1) ATC AGC GAT TCA CCC TTG TTT AAG CGT TAC ACC CAT GTT G
    (SEQ ID NO 35)
(1) ATC AGC GAT TCA CCC TTG TTT TAA GGT TAC ACC CAT GTT A
    (SEQ ID NO 36)
(1) ATC AGC GAT TAA CGC TTA TTT TAG CGT TAC ACC CAT GTT A
    (SEQ ID NO 37)
(1) ATC AGC GAT TAA CGC TTG TTT TAG TGT TGC ACC CAT GTT A
    (SEQ ID NO 38)
(1) ATC AGC GAT TAA CGC TTA TTT TAG CAT TAC ACC CAT GTT A
    (SEQ ID NO 39).

The number in parentheses indicates the number of clones having that particular
sequence. Nucleotides shown in bold are those that differ compared to the lead
compound.

Formal analysis of the cleavage activity of these clones is ongoing. The
population as a whole exhibits Mg²⁺-dependent cleavage activity at an observed rate of
~10⁻² min⁻¹, with a comparable level of activity in the presence of Pb²⁺.

Figures 6A and 6B provide two-dimensional illustrations of a "progenitor"
catalytic DNA molecule and one of several catalytic DNA molecules obtained via the
selective amplification methods disclosed herein, respectively. Figure 6A illustrates an
exemplary molecule from the starting pool, showing the overall configuration of the
molecules represented by SEQ ID NO 23. As illustrated, various complementary
nucleotides flank the random (N₄₀) region.

Figure 6B is a diagrammatic representation of one of the Mg²⁺-dependent
catalytic DNA molecules (or "DNAzymes") generated via the within-described
procedures. The location of the ribonucleotide in the substrate nucleic acid is indicated
via the arrow. (The illustrated molecule includes the sequence identified herein as SEQ
ID NO 25, as well as "beginning" and "ending" sequences of SEQ ID NO 23.)

Endonuclease activity is continuing to be enhanced in each of the
aforementioned "families" via in vitro evolution, as disclosed herein, so it is anticipated
that enzymatic DNA molecules of increasingly desirable specificities may be generated
successfully using the within-disclosed guidelines.

Example 5

Cleavage of Larger RNA Sequences

As an extension of the foregoing, we have developed DNA enzymes that cleave
an all-RNA substrate, rather than a single ribonucleotide embedded within an otherwise
all-DNA substrate as demonstrated above. (Also see R.R. Breaker & G.F. Joyce, Chem.
& Biol. 1: 223-229 (1994); R.R. Breaker & G.F. Joyce, Chem. & Biol. 2: 655-660
(1995).) As a target sequence, we chose a stretch of 12 highly-conserved nucleotides
within the U5 LTR region of HIV-1 RNA, having the sequence
5' GUAACUAGAGAU 3' (SEQ ID NO 49).

Following the methods described in the previous examples, we generated a pool
of 10¹⁴ DNA molecules that have the following composition:

5'- GGAAAA r(GUAACUAGAGAU) GGAAGAGATGGCGAC N₅₀
CGGTAAGCTTGGCAC -3' (SEQ ID NO 50),

where N is an equimolar mixture of the deoxyribonucleotides G, A, T, and C, and where
the sequence identified as "r(GUAACUAGAGAU)" is comprised of ribonucleotides.
(Optionally, one may alter the initial 5' nucleotide sequence, e.g., by adding an
additional dA residue to the sequence preceding the ribonucleotide portion at the 5' end,
thus causing the initial sequence to read "GGAAAAA" and causing SEQ ID NO 50 to be
99 residues in length. Clearly, this is but one example of the modifications that may be
made in order to engineer specific enzymatic DNA molecules, as disclosed in detail
herein.)

The enzymatic DNA molecules thus produced were selected for their ability to
cleave a phosphoester that lies within the embedded RNA target sequence. Ten rounds
of in vitro selective amplification were carried out, based on the enzymatic DNA
molecules' activity in the presence of 10 mM Mg²⁺ at pH 7.5 and 37°C. During the
selection process, there was competition for "preferred" cleavage sites as well as for the
"best" catalyst that cleaves at each such preferred site. Two sites and two families of
catalysts emerged as possessing the most efficient cleavage capabilities (see Fig. 7).

Figure 7 illustrates some of the results of ten rounds of in vitro selective
amplification carried out essentially as described herein. As shown, two sites and two
families of catalysts emerged as displaying the most efficient cleavage of the target
sequence. Cleavage conditions were essentially as indicated in Fig. 7, namely, 10 mM
Mg²⁺, pH 7.5, and 37°C; data collected after the reaction ran for 2 hours are shown.
Cleavage (%) is shown plotted against the number of generations (here, 0 through 10).
The number/prevalence of catalytic DNA molecules capable of cleaving the target
sequence at the indicated sites in the substrate is illustrated via the vertical bars, with
cleavage at G↓UAACUAGAGAU shown by the striped bars, and with cleavage at
GUAACUA↓GAGAU illustrated via the open (lightly-shaded) bars. In Figure 7, as herein,
the arrow (↓) indicates the site between two neighboring nucleotides at which cleavage
occurs.

Various individuals from the population obtained after the 8th and 10th rounds 
of selective amplification were cloned. The nucleotide sequences of 29 individuals from 
the 8th round and 32 individuals from the 10th round were then determined (see Tables 
2 and 3, respectively). 

Under the heading "Nucleotide Sequence" in each of Tables 2 and 3 is shown
the portion of each identified clone that corresponds to the 50 nucleotides that were
randomized in the starting pool (i.e., N₅₀); thus, the entire nucleotide sequence of a
given clone generally includes the nucleotide sequences preceding, following, and
including the "N₅₀" segment, presuming the substrate sequence is attached and that
self-cleavage has not occurred. For example, the entire sequence of a (non-self-cleaved)
clone may generally comprise residue nos. 1-33 of SEQ ID NO 50, followed by the
residues representing the randomized N₅₀ region, followed by residue nos. 84-98 of SEQ
ID NO 50; alternatively, it may comprise residue nos. 1-34 of SEQ ID NO 51, followed
by the residues representing the randomized N₅₀ region, followed by residue nos. 85-99
of SEQ ID NO 51. It is believed, however, that the N₅₀ (or N₄₀) region, or a portion
thereof, of each clone is particularly important in determining the specificity and/or
activity of a particular enzymatic DNA molecule. This is particularly evident in reactions
in which the substrate and the DNAzyme are separate molecules (see, e.g., Figs. 8
and 9).
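
The residue numbering given above can be illustrated by assembling a full-length (non-self-cleaved) clone from the fixed flanking sequences of SEQ ID NO 50 and an "N₅₀" sequence from Table 2. In this sketch the ribonucleotide target is written with U to distinguish it from the DNA residues, the clone 8-17 sequence (SEQ ID NO 56) serves as the example insert, and the function name and the `extra_dA` flag are hypothetical.

```python
LEADER_DNA = "GGAAAA"               # 5' deoxyribonucleotides
RNA_TARGET = "GUAACUAGAGAU"         # embedded ribonucleotide target (written with U)
FIXED_UPSTREAM = "GGAAGAGATGGCGAC"  # fixed DNA preceding the randomized region
FIXED_DOWNSTREAM = "CGGTAAGCTTGGCAC"  # fixed DNA following the randomized region

# "N50" region of clone 8-17 (SEQ ID NO 56), as listed in Table 2.
CLONE_8_17 = ("CTT CCA CCT TCC GAG CCG GAC GAA GTT ACT TTT TAT CAC ACT "
              "ACG TAT TG").replace(" ", "")


def assemble_clone(n50, extra_dA=False):
    """Join the fixed flanks around a randomized-region sequence.

    extra_dA=True adds the optional additional dA residue at the 5' end,
    giving the 99-residue arrangement described for SEQ ID NO 51.
    """
    leader = LEADER_DNA + ("A" if extra_dA else "")
    return leader + RNA_TARGET + FIXED_UPSTREAM + n50 + FIXED_DOWNSTREAM


clone_98 = assemble_clone(CLONE_8_17)
clone_99 = assemble_clone(CLONE_8_17, extra_dA=True)

print(len(CLONE_8_17), len(clone_98), len(clone_99))
print(clone_98[33:83] == CLONE_8_17)  # residues 34-83 hold the N50 region
print(clone_99[34:84] == CLONE_8_17)  # residues 35-84 in the 99-residue form
```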

Clone numbers are designated as 8-x or 10-x for individuals obtained after the
8th or 10th rounds, respectively. SEQ ID NOS are also listed and correspond to the
"N₅₀" region of each clone.

Table 2

Cloned Individuals from 8th Round of Amplification

Clone    SEQ
No.      ID NO   "N₅₀" Nucleotide Sequence (5'-3')

8-2      52      CCA ATA GTG CTA CTG TGT ATC TCA ATG CTG GAA ACA CGG GTT
                 ATC TCC CG
8-4      53      CCA AAA CAG TGG AGC ATT ATA TCT ACT CCA CAA AGA CCA CTT
                 TTC TCC CG
8-5¹     54      ATC CGT ACT AGC ATG CAG ACA GTC TGT CTG CTT TTT CAT TAC
                 TCA CTC CC
8-14     55      CAA TTC ATG ATG ACC AAC TCT GTC AAC ACG CGA ACT TTT AAC
                 ACT GGC A
8-17²    56      CTT CCA CCT TCC GAG CCG GAC GAA GTT ACT TTT TAT CAC ACT
                 ACG TAT TG
8-3      57      GGC AAG AGA TGG CAT ATA TTC AGG TAA CTG TGG AGA TAC CCT
                 GTC TGC CA
8-6      58      CTA GAC CAT TCA CGT TTA CCA AGC TAT GGT AAG AAC TAG AAT
                 CAC GCG TA
8-8      59      CGT ACA CGT GGA AAA GCT ATA AGT CAA GTT CTC ATC ATG TAC
                 CTG ACC GC
8-10     60      CAG TGA TAC ATG AGT GCA CCG CTA CGA CTA AGT CTG TAA CTT
                 ATT CTA CC
8-22     61      ACC GAA TTA AAC TAC CGA ATA GTG TGG TTT CTA TGC TTC TTC
                 TTC CCT GA
8-11     62      CAG GTA GAT ATA ATG CGT CAC CGT GCT TAC ACT CGT TTT ATT
                 AGT ATG TC
8-21     63      CCC TAC AAC ACC ACT GGG CCC AAT TAG ATT AAC GCT ATT TTA
                 TAA CTC G
8-12     64      CCA AAC GGT TAT AAG ACT GAA AAC TCA ATC AAT AGC CCA ATC
                 CTC GCC C
8-13     65      CAC ATG TAT ACC TAA GAA ATT GGT CCC GTA GAC GTC ACA GAC
                 TTA CGC CA
8-23     66      CAC AAC GAA AAC AAT CTT CCT TGG CAT ACT GGG GAG AAA GTC
                 TGT TGT CC
8-40     67      CAC ACG AAC ATG TCC ATT AAA TGG CAT TCC GTT TTT CGT TCT
                 ACA TAT GC
8-24     68      CAG AAC GAG GGT CTT GTA AGA CTA CAC CTC CTC AGT GAC AAT
                 AAT CCT G
8-26     69      CAC TAC AGC CTG ATA TAT ATG AAG AAC AGG CAA CAA GCT TAT
                 GCA CTG G
8-27     70      GGG TAC ATT TAT GAT TCT CTT ATA AAG AGA ATA TCG TAC TCT
                 TTT CCC CA
8-28     71      CCA AAG TAC ATT CCA ACC CCT TAT ACG TGA AAC TTC CAG TAG
                 TTT CCT A
8-29     72      CTT GAA GAT CCT CAT AAG ACG ATT AAA CAA TCC ACT GGA TAT
                 AAT CCG GA
8-34     73      CGA ATA GTG TCC ATG ATT ACA CCA ATA ACT GCC TGC CTA TCA
                 TGT TTA TG
8-35     74      CCA AGA GAG TAT CGG ATA CAC TTG GAA CAT AGC TAA CTC GAA
                 CTG TAC CA
8-36     75      CCA CTG ATA AAT AGG TAA CTG TCT CAT ATC TGC CAA TCA TAT
                 GCC GTA
8-37     76      CCC AAA TTA TAA ACA ATT TAA CAC AAG CAA AAG GAG GTT CAT
                 TGC TCC GC
8-39     77      CAA TAA ACT GGT GCT AAA CCT AAT ACC TTG TAT CCA AGT TAT
                 CCT CCC CC

¹ identical to 10-4, 10-40
² identical to 8-20, 8-32, 8-38, 10-1, 10-34; 1 mutation to 10-11; 3 mutations
  to 10-29

Table 3

Cloned Individuals from 10th Round of Amplification

Clone    SEQ
No.      ID NO   "N₅₀" Nucleotide Sequence (5'-3')

10-3³    78      CCG AAT GAC ATC CGT AGT GGA ACC TTG CTT TTG ACA CTA AGA
                 AGC TAC AC
10-10    79      CCA TAA CAA ATA CCA TAG TAA AGA TCT GCA TTA TAT TAT ATC
                 GGT CCA CC
10-12    80      CAG AAC AAA GAT CAG TAG CTA AAC ATA TGG TAC AAA CAT ACC
                 ATC TCG CA
10-14    81      CCT TTA GTT AGG CTA GCT ACA ACG ATT TTT CCC TGC TTG GCA
                 ACG ACA C
10-15    82      CTC CCT ACG TTA CAC CAG CGG TAC GAA TTT TCC ACG AGA GGT
                 AAT CCG CA
10-19    83      CGG CAC CTC TAG TTA GAC ACT CCG GAA TTT TTC CCC
10-39    84      CGG CAC CTC TAG TTA GAC ACT CCG GAA TTT TAG CCT ACC ATA
                 GTC CGG T
10-23    85      CCC TTT GGT TAG GCT AGC TAC AAC GAT TTT TCC CTG CTT GAA
                 TTG TA
10-27⁴   86      CCC TTT GGT TAG GCT AGC TAC AAC GAT TTT TCC CTG CTT GAC
                 CTG TTA CGA
10-31    87      CCT TTA GTT AGG CTA GCT ACA ACG ATT TTT CCC TGC TTG GAA
                 CGA CAC
10-18    88      CAT GGC TTA ATC ATC CTC AAT AGA AGA CTA CAA GTC GAA TAT
                 GTC CCC CC
10-20    89      CAA CAG AGC GAG TAT CAC CCC CTG TCA ATA GTC GTA TGA AAC
                 ATT GGG CC
10-6     90      TAC CGA CAA GGG GAA TTA AAA GCT AGC TGG TTA TGC AAC CCT
                 TTT CGC A
10-7     91      CTC GAA ACA GTG ATA TTC TGA ACA AAC GGG TAC TAC GTG TTC
                 AGC CCC C
10-8     92      CCA ATA ACG TAA CCC GGT TAG ATA AGC ACT TAG CTA AGA TGT
                 TTA TCC TG
10-16    93      CAA TAC AAT CGG TAC GAA TCC AGA AAC ATA ACG TTG TTT CAG
                 AAT GGT CC
10-21    94      GCA ACA ACA AGA ACC AAG TTA CAT ACA CGT TCA TCT ATA CTG
                 AAC CCC CA
10-24    95      CCT TTG AGT TCC TAA ATG CCG CAC GGT AAG CTT GGC ACA CTT
                 TGA CTG TA
10-28    96      CAA AGA TCT CAC TTT GGA AAT GCG AAA TAT GTA TAT TCG CCC
                 TGT CTG C
10-33    97      CCA CGT AGA ATT ATC TGA TTT ATA ACA TAA CGC AGG ATA ACT
                 CTC GCC CA
10-35    98      CAC AAG AAA GTG TCG TCT CCA GAT ATT TGA GTA CAA GGA ACT
                 ACG CCC
10-36    99      CAT GAA GAA ATA GGA CAT TCT ACA GGC TGG ACC GTT ACT ATG
                 CCT GTA GG
10-37    100     CAT AGG ATA ATC ATG GCG ATG CTT ATG ACG TGT ACA TCT ATA
                 CCT T
10-38    101     CAG ATG ATC TTC CTT TAA AGA CTA CCC TTT AAA GAA ACA TAA
                 GGT ACC CC

³ 1 mutation to 10-5
⁴ 1 mutation to 10-30

The self-cleavage activity of various clones was subsequently measured. Clones
8-5, 8-17, and 10-3 were found to cleave efficiently at the site 5' GUAACU↓AGAGAU
3', while clones 10-14, 10-19 and 10-27 were found to cleave efficiently at the site 5'
G↓UAACUAGAGAU 3'. When the RNA portion of the molecule was extended to the
sequence 5' GGAAAAAGUAACUAGAGAUGGAAG 3' (residue nos. 1-24 of SEQ ID NO
51), clones 8-17, 10-14, and 10-27 retained full activity, while clones 8-5, 10-3, and
10-19 showed diminished activity. Subsequently, clone 10-23 was found to exhibit a
high level of activity in the self-cleavage reaction involving the extended RNA domain.

It should also be noted, in the event one of skill in the relevant art does not
appreciate same, that the nucleotide sequences preceding and following the "N₅₀"
segments of the polynucleotide molecules engineered according to the teachings of the
present invention disclosure may be altered in a variety of ways in order to generate
enzymatic DNA molecules of particular specificities. For example, while residue nos.
1-24 of SEQ ID NO 51 are described herein as RNA nucleotides, they may alternatively
comprise DNA, RNA, or composites thereof. (Thus, for example, SEQ ID NO 51 could
easily be altered so that nucleic acid residue nos. 1-7 would comprise DNA, residue nos.
8-19 would comprise RNA, residue nos. 20-99 would comprise DNA, and so on.)
Similarly, the nucleotides following the "N₅₀" region may comprise RNA, DNA, or
composites thereof. The length of the regions preceding and following the "N₅₀" (or
"N₄₀"; see Example 4) region(s) may also be varied, as disclosed herein. Further,
sequences preceding and/or following N₅₀ or N₄₀ regions may be shortened, expanded,
or deleted in their entirety.

Moreover, as noted above, we selected a specific region of HIV-1 RNA as the 
target sequence in the methods described in this Example; such a sequence is not the 
only sequence one may use as a target. Clearly, one of skill in the relevant art may 
follow our teachings herein to engineer and design enzymatic DNA molecules with 
specificity for other target sequences. As disclosed herein, such target sequences may 
be constructed or inserted into larger sequences comprising DNA, RNA, or composites 
thereof, as illustrated by SEQ ID NOS 50 and 51. 

The self-cleavage reaction was easily converted to an intermolecular cleavage
reaction by dividing the enzyme and substrate domains into separate molecules. Clones
8-17 and 10-23 were chosen as prototype molecules. Both were shown to act as DNA
enzymes in the cleavage of a separate all-RNA substrate in a reaction that proceeds with
multiple turnover (Fig. 8). The substrate binding arms were subsequently reduced to 7
base-pairs on each side of the unpaired nucleotide that demarcates the cleavage site
(Fig. 9).

Figure 8 illustrates the nucleotide sequences, cleavage sites, and turnover rates
of two catalytic DNA molecules of the present invention, clones 8-17 and 10-23.
Reaction conditions were as shown, namely, 10 mM Mg²⁺, pH 7.5, and 37°C. The
DNAzyme identified as clone 8-17 is illustrated on the left, with the site of cleavage of
the RNA substrate indicated by the arrow. The substrate sequence
(5'-GGAAAAAGUAACUAGAGAUGGAAG-3'), which is separate from the DNAzyme (i.e.,
intermolecular cleavage is shown), is labeled as such. Similarly, the DNAzyme
identified herein as 10-23 is shown on the right, with the site of cleavage of the RNA
substrate indicated by the arrow. Again, the substrate sequence is indicated. For the
8-17 enzyme, the turnover rate was approximately 0.6 hr⁻¹; for the 10-23 enzyme, the
turnover rate was approximately 1 hr⁻¹.

As illustrated in Fig. 8, the nucleotide sequence of the clone 8-17 catalytic DNA
molecule capable of cleaving a separate substrate molecule was as follows:
5'-CTTCCACCTTCCGAGCCGGACGAAGTTACTTTTT-3' (residue nos. 1-34 of SEQ ID
NO 56). In that same figure, the nucleotide sequence of the clone 10-23 catalytic DNA
molecule capable of cleaving a separate substrate molecule was as follows:
5'-CTTTGGTTAGGCTAGCTACAACGATTTTTCC-3' (residue nos. 3-33 of SEQ ID NO
85).

Figure 9 further illustrates the nucleotide sequences, cleavage sites, and 
turnover rates of two catalytic DNA molecules of the present invention, clones 8-17 and 
10-23. Reaction conditions were as shown, namely, 10 mM Mg²⁺, pH 7.5, and 37°C. 
As in Fig. 8, the DNAzyme identified as clone 8-17 is illustrated on the left, with the site 
of cleavage of the RNA substrate indicated by the arrow. The substrate sequence 
(5'-GGAAAAAGUAACUAGAGAUGGAAG-3'), which is separate from the DNAzyme (i.e., 
intermolecular cleavage is shown), is labeled as such. Similarly, the DNAzyme 
identified herein as 10-23 is shown on the right, with the site of cleavage of the RNA 
substrate indicated by the arrow. Again, the substrate sequence is indicated. For the 
8-17 enzyme, k_obs was approximately 0.002 min⁻¹; for the 10-23 enzyme, the value of 
k_obs was approximately 0.01 min⁻¹. 
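
To give a feel for the observed rate constants quoted for Fig. 9, the sketch below computes the time needed to cleave half of the substrate. It assumes, as an illustration only, that cleavage follows simple first-order (single-exponential) kinetics with rate constant k_obs; the text does not state the kinetic model, and the function names are hypothetical.

```python
import math


def fraction_cleaved(k_obs_per_min: float, t_min: float) -> float:
    """Fraction of substrate cleaved after t_min minutes, assuming first-order decay."""
    return 1.0 - math.exp(-k_obs_per_min * t_min)


def half_life_min(k_obs_per_min: float) -> float:
    """Time in minutes at which half of the substrate is cleaved (ln 2 / k_obs)."""
    return math.log(2) / k_obs_per_min


# Approximate k_obs values reported for Fig. 9 (per minute).
k_obs = {"8-17": 0.002, "10-23": 0.01}

# Under the first-order assumption: about 347 min for 8-17 and about 69 min for 10-23.
half_lives = {clone: half_life_min(k) for clone, k in k_obs.items()}
```

Under that assumption the five-fold difference in k_obs translates directly into a five-fold difference in half-life.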

As illustrated in Fig. 9, the nucleotide sequence of the clone 8-17 catalytic DNA 
molecule capable of cleaving a separate substrate molecule was as follows: 
5'-CCACCTTCCGAGCCGGACGAAGTTACT-3' (residue nos. 4-30 of SEQ ID NO 56). In 
that same figure, the nucleotide sequence of the clone 10-23 catalytic DNA molecule 
capable of cleaving a separate substrate molecule was as follows: 
5'-CTAGTTAGGCTAGCTACAACGATTTTTCC-3' (residue nos. 5-33 of SEQ ID NO 85, 
with "CTA" substituted for "TTG" at the 5' end). 
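
The relationship between the Fig. 8 and Fig. 9 molecules can be checked mechanically from the residue numbers given in the text. The following sketch, with hypothetical helper and variable names, derives the shortened Fig. 9 sequences from the Fig. 8 sequences; it relies only on the sequences and residue ranges stated above and does not reproduce the full SEQ ID NO 56 or SEQ ID NO 85 entries.

```python
def residues(seq: str, first: int, last: int) -> str:
    """Return residues first..last of seq, using 1-based inclusive numbering."""
    return seq[first - 1:last]


# Fig. 8 sequences as given in the text (5' to 3').
FIG8_8_17 = "CTTCCACCTTCCGAGCCGGACGAAGTTACTTTTT"  # residues 1-34 of SEQ ID NO 56
FIG8_10_23 = "CTTTGGTTAGGCTAGCTACAACGATTTTTCC"    # residues 3-33 of SEQ ID NO 85

# Fig. 9, clone 8-17: residues 4-30 of SEQ ID NO 56.
fig9_8_17 = residues(FIG8_8_17, 4, 30)

# Fig. 9, clone 10-23: residues 5-33 of SEQ ID NO 85. The Fig. 8 string starts
# at residue 3, so residue 5 sits at offset 5 - 3 = 2 within it.
trimmed = FIG8_10_23[5 - 3:]
assert trimmed.startswith("TTG")
fig9_10_23 = "CTA" + trimmed[3:]  # "CTA" substituted for "TTG" at the 5' end
```

Shortening the binding arms in this way yields a 27-nucleotide 8-17 enzyme and a 29-nucleotide 10-23 enzyme.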

The catalytic rate of the RNA-cleaving DNA enzymes has yet to be fully 
optimized. As disclosed above and as reported in previous studies, we have been able 
to improve the catalytic rate by partially randomizing the prototype molecule and 
carrying out additional rounds of selective amplification. We have found, however, that 
the K_m for Mg²⁺ is approximately 5 mM and 2 mM for the 8-17 and 10-23 DNA 
enzymes, respectively, measured at pH 7.5 and 37°C; this is certainly compatible with 
intracellular conditions. 
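
The practical meaning of these K_m values can be sketched by assuming a simple hyperbolic (Michaelis-Menten-type) dependence of the rate on Mg²⁺ concentration. That functional form is an assumption made here for illustration and is not stated in the text, and the function name is hypothetical. The example evaluates the fraction of the maximal rate at the 10 mM Mg²⁺ used in the reactions of Figs. 8 and 9.

```python
def fraction_of_max_rate(mg_mM: float, km_mM: float) -> float:
    """Fraction of the maximal rate at a given Mg2+ concentration,
    assuming simple hyperbolic saturation: [Mg] / (K_m + [Mg])."""
    if mg_mM < 0 or km_mM <= 0:
        raise ValueError("concentrations must be non-negative and K_m positive")
    return mg_mM / (km_mM + mg_mM)


# Approximate K_m values for Mg2+ reported at pH 7.5 and 37 deg C (mM).
KM_MG = {"8-17": 5.0, "10-23": 2.0}

# At 10 mM Mg2+: about 0.67 of the maximal rate for 8-17, about 0.83 for 10-23.
at_10_mM = {clone: fraction_of_max_rate(10.0, km) for clone, km in KM_MG.items()}
```

Other Mg²⁺ concentrations can be substituted for the 10 mM value to estimate how far the rate falls as the metal concentration is lowered.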



The foregoing specification, including the specific embodiments and examples, is 
intended to be illustrative of the present invention and is not to be taken as limiting. 
Numerous other variations and modifications can be effected without departing from the 
true spirit and scope of the present invention. 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/472,194 
(B) FILING DATE: 07-JUN-1995 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/349,023 

(B) FILING DATE: 02-DEC-1994 

(viii) ATTORNEY / AGENT INFORMATION: 

(A) NAME: Logan, April C. 

(B) REGISTRATION NUMBER: 33,950 

(C) REFERENCE / DOCKET NUMBER: 463.2 PC 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (619) 554-2937 

(B) TELEFAX: (619) 554-6312 



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 15 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

CGGTAAGCTT GGCAC 15 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 20 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 
(A) NAME/KEY: misc_difference 
(B) LOCATION: replace(8, "") 
(D) OTHER INFORMATION: /standard_name= "ADENOSINE RIBONUCLEOTIDE" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

TCACTATNAG GAAGAGATGG 20 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 38 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

ACACATCTCT GAAGTAGCGC CGCCGTATAG TGACGCTA 38 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 80 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

GTGCCAAGCT TACCGNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 60 
NNNNNGTCGC CATCTCTTCC 80 



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 28 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 
(A) NAME/KEY: misc_feature 
(B) LOCATION: 28 
(D) OTHER INFORMATION: /standard_name= "2' 3' CYCLIC PHOSPHATE" 

(ix) FEATURE: 
(A) NAME/KEY: misc_difference 
(B) LOCATION: replace(28, "") 
(D) OTHER INFORMATION: /standard_name= "ADENOSINE RIBONUCLEOTIDE" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

GGGACGAATT CTAATACGAC TCACTATN 28 



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 28 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 






(ix) FEATURE: 
(A) NAME/KEY: misc_difference 
(B) LOCATION: replace(28, "") 
(D) OTHER INFORMATION: /standard_name= "ADENOSINE RIBONUCLEOTIDE" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

GGGACGAATT CTAATACGAC TCACTATN 28 



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 19 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 
(A) NAME/KEY: misc_difference 
(B) LOCATION: replace(8, "") 
(D) OTHER INFORMATION: /standard_name= "ADENOSINE RIBONUCLEOTIDE" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

TCACTATNGG AAGAGATGG 19 



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 8 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 
(A) NAME/KEY: misc_difference 
(B) LOCATION: replace(8, "") 
(D) OTHER INFORMATION: /standard_name= "ADENOSINE NUCLEOTIDE" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

TCACTATN 8 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 30 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

CCATCTCTTC CTATAGTGAG TCCGGCTGCA 30 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 15 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

GTGCCAAGCT TACCG 15 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 43 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

CTGCAGAATT CTAATACGAC TCACTATAGG AAGAGATGGC GAC 43 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 19 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 
(A) NAME/KEY: misc_difference 
(B) LOCATION: replace(8, "") 
(D) OTHER INFORMATION: /standard_name= "ADENOSINE RIBONUCLEOTIDE" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

TCACTATNGG AAGAGATGG 19 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 43 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 
(A) NAME/KEY: misc_difference 
(B) LOCATION: replace(28, "") 
(D) OTHER INFORMATION: /standard_name= "ADENOSINE RIBONUCLEOTIDE" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

GGGACGAATT CTAATACGAC TCACTATNGG AAGAGATGGC GAC 43 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14: 
TCACACATCT CTGAAGTAGC GCCGCCGTAT GTGACGCTAG GGGTTCGCCT 50 
(2) INFORMATION FOR SEQ ID NO: 15: 
(i) SEQUENCE CHARACTERISTICS: 